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Abstract 

This  dissertation  describes  research  into  image  processing  techniques  that  en¬ 
hance  military  operational  and  support  activities.  The  research  extends  existing 
work  on  image  registration  by  introducing  a  novel  method  that  exploits  local  correla¬ 
tions  to  improve  the  performance  of  projection-based  image  registration  algorithms. 
The  algorithm  is  shown  to  operate  in  low  signal-to-noise  ratio  (SNR)  conditions 
and  to  significantly  improve  registration  performance  by  as  much  as  a  factor  of  5.5 
in  mean-squared  error  over  existing  projection-based  registration  algorithms  at  a 
minimal  computational  cost. 

The  dissertation  also  extends  the  bounds  on  image  registration  performance  for 
both  projection-based  and  full-frame  image  registration  algorithms  and  extends  the 
Barankin  Bound  from  the  one-dimensional  case  to  the  problem  of  two-dimensional 
image  registration.  The  Cramer-Rao  and  Barankin  bounds  are  calculated  for  regis¬ 
tration  performed  using  2-D  registration  algorithms  and  compared  to  bounds  on  reg¬ 
istration  estimates  calculated  using  computationally  efficient  projection-based  reg¬ 
istration  algorithms.  It  is  demonstrated  that  in  some  instances,  the  Cramer-Rao 
lower  bound  is  an  overly-optimistic  predictor  of  image  registration  performance  and 
that  under  some  conditions  the  Barankin  bound  is  a  better  predictor  of  shift  esti¬ 
mator  performance.  These  conditions  include  low-SNR  imaging  and  imaging  under 
defocus  error,  two  conditions  which  are  frequently  encountered  in  military  imaging 
systems  that  employ  passive  infrared,  light  radar  (LIDAR),  and  synthetic  aperture 
radar  (SAR). 

The  research  looks  at  the  related  problem  of  single-frame  image  denoising  us¬ 
ing  block-based  methods.  The  research  introduces  three  new  algorithms  for  single¬ 
frame  image  denoising  that  operate  by  identifying  regions  of  interest  within  a  noise- 
corrupted  image  and  then  generating  noise  free  estimates  of  the  regions  as  averages 


IV 


of  similar  regions  in  the  image.  The  algorithms  are  shown  to  outperform  Wiener 
and  median  filtering  over  a  wide  range  of  noise  conditions  but  are  most  effective  in 
images  with  very  low  signal-to-noise  ratios. 
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I.  Introduction 

For  the  United  States  Air  Force,  images  are  crucial  for  real-time  intelligence, 
target  planning,  and  flight  operations.  Images  are  also  analyzed  extensively  by 
military  personnel  engaged  in  medical  diagnostics,  nondestructive  inspection,  astron¬ 
omy,  security,  law  enforcement  and  counterintelligence.  Not  only  are  these  images 
used  for  a  variety  of  purposes,  they  are  also  taken  under  a  variety  of  adverse  con¬ 
ditions  and  with  illumination  sources  that  range  in  frequency  from  ultrasound  to 
x-ray.  In  order  to  minimize  power  consumption  to  avoid  detection  of  the  receiver 
in  an  operational  scenario,  the  imaging  systems  being  used  may  be  passive  (as  with 
passive  infrared  systems)  or  may  use  signals  of  opportunity  such  as  street  lights  for 
illumination  sources.  Compounding  these  challenges,  the  receiver  may  be  located  at 
a  distance  miles  away  from  the  target,  may  have  only  a  fleeting  glance  at  a  target, 
and  may  be  degraded  by  severe  thermal  or  weather  conditions.  These  systems  may 
also  be  used  in  unpredictable  environments  with  insufficient  illumination,  signifi¬ 
cant  background  noise,  and  spurious  electronic  emissions.  Despite  theses  challenges, 
as  noted  by  Driggers  et  al.  [20],  these  imaging  systems  may  be  used  to  determine 
whether  or  not  to  fire  on  a  particular  target.  Thus,  the  ability  to  receive  and  interpret 
images  is  key  to  making  decisions  having  lethal  consequences. 

1.1  Problem  Setting 

The  ongoing  challenge  addressed  by  the  research  described  in  this  dissertation 
is  the  estimation  of  underlying  image  parameters  from  available  data  when  that  data 
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is  corrupted  by  additive  noise.  Some  amount  of  noise  is  always  present  in  digital 
imaging  systems  for  reasons  eloquently  described  by  Snyder  et  al.  [51].  These  noise 
sources  include  random  variations  in  photon  conversions  during  object  illumination, 
thermal  noise  in  the  sensor,  background  noise  caused  by  luminescent  radiation  in  the 
area  of  the  sensor,  sensor  biases,  and  random  noise  induced  by  the  sensor  amplifier 
during  readout  of  the  individual  pixel. 

The  approach  used  in  this  research  to  solve  this  estimation  problem  is  to  fuse 
available  information  to  estimate  the  noise-free  intensity  of  a  scene,  the  motion  of 
the  imaging  sensor,  or  the  motion  of  an  object  within  an  image.  One  aspect  of  this 
problem  that  will  be  explored  is  the  fusion  of  multiple  frames  of  the  same  scene  to 
create  an  estimate  of  intensity  scene  that  is  better  than  any  single  frame  of  that 
scene. 


1.1.1  Image  Registration.  When  a  sensor  is  able  to  take  many  frames  of 
the  same  scene,  or  when  many  frames  of  the  same  scene  are  available  from  different 
sensors,  it  is  possible  to  combine  these  frames  to  estimate  the  parameters  of  the 
scene  or  the  sensors.  A  simple  example  of  this  is  shown  in  Figure  1.1  where  several 
frames  of  the  same  scene  are  averaged  to  form  an  improved  estimate  of  an  image  of 
a  tank.  However,  the  ability  to  perform  this  fusion  is  much  more  complicated  than 
this  figure  indicates. 

In  an  actual  imaging  system  the  sensor  creating  the  image  may  be  moving 
or  vibrating  and  the  frames  of  the  image  captured  by  the  sensor  may  be  separated 
by  some  unknown  spatial  offset.  The  process  of  estimating  this  offset  and  aligning 
the  image  frames  is  called  image  registration  [7].  If  the  images  can  be  registered 
correctly,  the  estimated  offset  of  the  frames  may  be  used  to  estimate  the  motion  of 
the  sensor.  This  real-time  application  of  the  data  requires  registration  algorithms 
that  are  fast  and  highly  accurate. 
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PSNR  =  24.5799 


Figure  1.1:  Example  of  multiple  frames  of  the  same  scene  used 
to  improve  the  estimate  of  a  scene. 

The  research  proposed  in  this  dissertation  addresses  this  need  by  introducing  a 
novel  method  for  optimizing  the  performance  of  projection-based  image  registration 
algorithms.  To  keep  this  examination  tractable,  the  problem  is  bounded  to  include 
only  translational  shifts.  The  mathematical  model  which  will  be  used  throughout  this 
dissertation  to  describe  this  operation  is  now  defined  where  it  is  assumed  there  are 
two  N  x  N  observations  of  an  image  I  corrupted  with  additive  white  Gaussian  noise 
(AWGN)  Q  where  I(x,y),Q(x,y)  G  M  so  that  D (x,y)*  =  D (x,y)  under  complex 
conjugation. 

For  the  one-dimensional  case,  an  operator  TQ  is  defined  that  acts  on  a  vector 
i  such  that 


(TQi)(x)  =  i  (x-a)  (1.1) 

For  the  two-dimensional  case,  an  operator  Ta,y  is  defined  that  acts  on  a  vectorized 
version  of  I  such  that 


(Toggl  )(x,y)  =  I  (x-a,y-P) 


(1.2) 
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Ta  and  Ta)Jg  are  unitary,  norm-preserving  operators  that  can  be  realized  using  cir- 
culant  matrices  and  have  the  properties 


T  T 

rr'(cn+02)  5 

(1.3) 

(T„f 

—  T 

J — a, 

(1.4) 

TQi,/3iTai,/32 

—  +(h)i 

(1.5) 

—  T(0,1_|_Q,2)  ^  , 

(1.6) 

(Tq>/3)T 

=  T -a,-p- 

(1.7) 

Subscripting  with  n  G  {1,2}  to  indicate  the  number  of  the  observation,  two 
frames  of  image  data  are  now  defined 


Di  —  I  +  Qi, 


(1.8) 


D2  —  TQj/gI  +  Q2, 


(1.9) 


where  a  and  (3  are  shifts  of  the  diffraction-limited  image  I  in  the  x  and  y  direc¬ 
tions  respectively.  Furthermore,  the  additive  noise,  Qi  and  Q2  are  defined  to  be 
independent  and  identically  distributed  Gaussian  with  zero  mean  and  variance  cr2. 

The  optimization  method  introduced  by  this  dissertation  employs  projections 
of  sequential  image  frames  to  create  an  estimate  of  their  spatial  offset.  This  is  shown 
to  be  a  computationally  simple  approach  that  is  also  highly  accurate.  The  x  and  y 
projections  of  D„  are  defined  in  the  dissertation  as  d.itX(y)  and  drhy{x)  where  [11] 


N- 1 

dn,,a;(l/)  =  ^  ^  Dn((T,  |/), 

x=0 

N- 1 

(1.10) 

dn,y{x)  =  n(x,y). 

y= 0 

(1.11) 
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Expanding  these  terms  for  frame  i  in  direction  y  yields 


TV— 1  TV-1 

<WX)  =  +  (L12) 


Ltd  q-(d 


where  iy(x)  and  qn(a;)  are  introduced  as  notation  for  the  projection  of  the  diffraction 
limited  image  and  the  projection  of  the  noise  in  the  image.  Note  that  n  e  {1,2} 
is  an  index  and  i  is  a  vector  projection  of  the  data.  The  research  described  in  the 
dissertation  includes  the  introduction  of  a  novel  method  for  filtering  these  projections 
to  improve  the  accuracy  of  estimates  of  image  translations. 

The  accuracy  of  estimates  on  translations  between  images  is  a  function  of  image 
content  and  the  amount  of  additive  noise  in  the  images.  Limits  on  this  accuracy 
have  been  predicted  using  information-theoretic  analytical  tools  such  as  the  Cramer- 
Rao  lower  bound  [53].  However,  this  commonly-used  bound  provides  an  incomplete 
description  of  the  behavior  exhibited  by  image  registration  algorithms  under  the 
high-noise  conditions  present  in  some  military  imaging  systems.  Linder  high-noise 
conditions,  the  Cramer-Rao  lower  bound  is  shown  in  this  research  to  be  an  overly- 
optimistic  predictor  of  estimator  performance. 

To  address  this  shortcoming,  this  dissertation  applies  the  Barankin  bound  to 
the  image  registration  problem.  This  bound,  used  traditionally  for  estimating  trans¬ 
lational  shifts  between  one- dimensional  signals  [37,38,41],  is  extended  here  to  the 
two-dimensional  problem  of  image  registration.  The  Barankin  bound  is  used  to  find 
a  lower  bound  on  the  mean-squared  error  of  shift  estimates  generated  from  images 
that  are  out  of  focus  and  corrupted  by  noise.  It  is  also  employed  to  compare  the 
performance  of  registration  estimates  generated  from  projections. 

1.1.2  Single- Frame  Image  Denoising.  Single- frame  image  denoising  al¬ 

gorithms  combine  information  from  a  single  noise-corrupted  image  to  estimate  the 
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Figure  1.2:  Examples  of  apparent  redundant  subimages  in  an 
image. 

intensity  of  the  underlying  noise-free  scene.  A  variety  of  approaches  have  been  used 
to  perform  this  denoising  [1,6,9,12,13,16,17,31,32,40,49,52].  All  of  these  approaches 
employ  some  measure  of  similarity  for  comparing  and  combining  individual  pixels 
in  an  image.  Several  of  these  methods  [9,31,32]  compare  pixels  by  measuring  the 
similarity  of  local  neighborhoods  around  these  images. 

As  shown  in  Figure  1.2,  it  is  a  often  a  simple  exercise  to  at  least  roughly 
identify  similar  regions  within  images  that  may  contribute  to  a  noise-reduced  block 
average.  If  mean  of  the  noise  in  the  redundant  subimages  is  zero,  averaging  should 
yield  a  result  that  is  close,  in  an  L2-norm  sense,  to  the  diffraction-limited  subimage. 
This  process  is  similar  to  multiframe  image  denoising  using  small  frames  but  has  the 
additional  challenge  of  differentiating  similar  from  dissimilar  content.  These  methods 
have  been  shown  to  be  effective  [9,31,32];  however,  the  methods  in  the  literature  also 
rely  on  parameters  that  are  specific  to  the  image  under  study  and  may  be  further 
improved. 

This  dissertation  introduces  several  new  image  processing  algorithms  that  com¬ 
pare  and  combine  subimages  within  a  frame  to  provide  denoising  performance  that 
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is  on  par  with,  and  in  some  cases  superior  to,  the  performance  achieved  by  other 
state-of-the-art  algorithms.  Unlike  similar  block-based  algorithms,  these  algorithms 
suspend  the  requirement  for  similar  blocks  to  be  located  in  close  spatial  proximity 
to  each  other.  This  is  shown  to  improve  denoising  performance  over  existing  block- 
based  methods  in  the  most  severely  corrupted  images  (as  measured  by  signal-to-noise 
ratio  (SNR))  while  reducing  reliance  on  image-specific  parameters  in  the  algorithm. 

1.2  Purpose  of  the  Research 

The  research  described  in  this  dissertation  examines  the  interrelated  problems 
of  image  registration  and  image  denoising.  It  develops  novel  methods  to  improve 
the  performance  of  existing  registration  and  denoising  algorithms.  It  also  extends 
the  general  understanding  of  the  performance  of  image  registration  algorithms  by 
extending  the  existing  one-dimensional  applications  of  the  Barankin  bound  to  the 
two-dimensional  problem  of  image  registration. 

1.3  Overview 

As  a  synopsis  of  the  dissertation  structure,  the  following  outline  is  provided: 
Chapter  II  provides  a  review  of  related  research  and  theoretical  background  that  will 
support  discussion  of  the  concepts  in  the  following  chapters.  Chapter  III  introduces 
a  computationally-efficient  method  for  estimating  translational  shifts  of  frames  of  the 
same  scene.  This  method  can  be  used  for  estimating  the  motion  of  a  sensor  between 
frames  or  to  facilitate  multiframe  averaging.  Chapter  IV  examines  and  extends  the 
theoretical  bounds  on  estimates  of  shifts  between  similar  images  and  subimages. 
The  effects  of  filtering,  projecting  and  defocus  errors  in  images  are  examined  and 
the  Barankin  bound  on  image  registration  is  introduced  and  demonstrated  to  be 
relevant  to  projections  and  low-intensity  image  processing.  Chapter  V  introduces 
several  new  algorithms  that  exploit  regularity  in  an  image  to  facilitate  denoising 
when  only  a  single  frame  of  a  scene  is  available.  The  dissertation  concludes  in 
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Chapter  VI  with  a  summary  of  the  research  in  this  dissertation  and  proposals  for 
further  work  on  statistical  image  processing  problems. 


II.  Literature  Review 


The  problems  of  image  registration  and  single-frame  image  denoising  have  been 
extensively  studied  in  the  literature.  This  survey  of  existing  work  looks  at 
those  methods  from  recent  literature  that  offer  the  most  computationally-efficient 
methods  for  estimating  shifts  between  images.  This  review  identifies  opportunities 
for  improving  on  existing  methods  for  projection-based  image  registration  and  iden¬ 
tifies  a  method  for  improving  the  accuracy  of  estimates  on  the  performance  limits 
of  registration  estimates.  The  review  also  examines  statistical  methods  rooted  in  in¬ 
formation  theory  for  analyzing  the  performance  of  those  methods.  The  review  then 
covers  a  number  of  related  single-frame  image  denoising  algorithms  that  are  similar 
in  nature  to  the  multiframe  averaging  methods  that  are  facilitated  by  image  regis¬ 
tration.  The  review  provides  insight  into  ways  that  existing  block-based  algorithms 
may  be  improved. 

2.1  Projection- Based  Image  Registration 

Image  registration  as  described  in  this  dissertation  is  the  process  of  spatially 
aligning  images  which  may  have  been  taken  by  different  sensors  or  were  captured  at 
different  times.  This  spatial  alignment  may  be  used  for  multiframe  image  denoising 
[36]  or  for  estimating  camera  motion  [46].  Images  may  be  registered  using  a  variety 
of  techniques  including  cross  correlations,  Fourier  transforms,  and  by  identifying 
and  aligning  features  within  different  images.  An  overview  of  these  and  other  image 
registration  algorithms  is  available  in  [7]. 

If  an  imaging  sensor  is  used  primarily  for  motion  estimation,  image  projections 
offer  what  is  perhaps  the  fastest  approach  for  registering  available  images  at  a  low 
computational  cost  [11,45].  This  speed  is  the  result  of  faster  data  acquisition  times 
and  reduced  computational  complexity.  As  described  by  Cain  [10],  existing  Charge- 
Coupled  Devices  (CCDs)  need  to  be  read  out  serially  if  an  entire  image  is  to  be 
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acquired.  However,  if  only  a  projection  is  required,  the  projection  may  be  formed  by 
integrating  charges  corresponding  to  photon  counts  across  the  vertical  and  horizontal 
axes  of  some  existing  CCDs  [10].  Although  less  image  information  is  available  in  the 
projections  than  in  a  full  2-D  image,  Cain  et  al.  note  that  in  the  presence  of  fixed 
pattern  and  temporal  noise,  projection-based  methods  can  provide  performance  that 
is  actually  superior  to  that  of  2-D  cross  correlations  [11]. 

In  addition,  registering  the  image  projections  requires  two  1-D  cross  correla¬ 
tions  instead  of  one  2-D  cross  correlation.  Cross-correlation  of  N  x  N  2-D  images 
is  most  efficiently  performed  in  the  frequency  domain  and  requires  a  total  of  three 
fast  Fourier  transforms  (FFTs)  yielding  a  computational  complexity  of  0(6N2logN). 
For  the  same  image,  the  cross-correlation  of  its  projections  in  the  frequency  domain 
requires  six  FFTs  for  a  total  complexity  of  0(6 NlogN).  Using  projections,  methods 
involving  only  real  numbers  (and  possibly  only  integers)  become  feasible  yielding 
further  reductions  in  computational  complexity.  For  motion  estimation  purposes, 
this  combination  of  readout  speed  and  low  computational  complexity  makes  these 
algorithms  especially  attractive. 

Cain  et  al.  [11]  describe  how  registering  images  using  their  projections  can 
improve  performance  over  2-D  correlation  methods  when  significant  fixed  pattern 
noise  (FPN)  is  present  in  the  images.  The  mechanism  behind  this  is  that  the  signal 
in  the  projections  is  correlated  and  the  FPN  is  assumed  to  be  uncorrelated.  They 
go  on  to  note  that  the  ability  to  correctly  register  two  images  in  correlation-based 
image  registration  is  dependant  not  only  on  the  height  of  the  autocorrelation  peak, 
but  also  on  the  difference  between  the  peak  and  other  points  on  the  autocorrelation. 

A  windowing  operator  is  used  in  their  calculation  which  is  denoted  as  Wf  and 
is  realized  computationally  as  a  diagonal  matrix  where  the  diagonal  elements  are 


Wf(zy) 


1 

0 


| z  —  mp |  <  (N/ 2  —  Ss) 
else 


(2.1) 
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where  the  variable  mp  is  the  value  of  z  corresponding  to  the  midpoint  of  a  projection 
(i.e.  the  index  of  the  center  point  of  the  projection)  and  8S  is  the  maximum  allowable 
shift  value  between  the  frames.  This  windowing  function  is  necessary  to  ensure 
that  the  data  overlaps  for  a  sufficient  number  of  terms  of  the  cross-correlation  and 
mitigates  the  reduction  in  cross  correlation  power  that  would  otherwise  occur  for 
two  identically  sized  projections. 

Using  the  projections  and  the  windowing  function  notation,  Cain  et  al.  com¬ 
pute  the  1-D  cross  correlations  of  the  windowed  projections  of  two  images  as  [11] 

P y  =  di;y  *  Wfd2,y  -  Wfdi,y  ©  d 2,y,  (2.2) 

Px  =  d1;a.  *  Wfda,*  -  WfdM  ©  d2tX.  (2.3) 

where  diz,  z  G  {x,y}  denotes  a  vector  of  length  N  with  all  elements  equal  to  the 
scalar  average  of  a  given  projection  and  (Wfdi)2)  is  a  vector  with  the  elements  equal 
to  the  mean  of  d ,  2  over  a  windowed  area  for  a  given  index  of  d j  2.  The  notation  * 
is  used  to  indicate  convolution  and  0  to  indicate  a  Hadamard  multiplication.  The 
windowing  function  in  (2.2)  and  (2.3)  effectively  bounds  the  search  area  for  the 
registration  peak  to  a  subregion  within  di)3;  or  di)S/.  This  windowing  function  should 
be  defined  based  on  a  bound  on  the  translation  between  frames.  Large  translations 
will  necessitate  a  small  window  and  vice  versa. 

The  shift  estimate  is  computed  from  the  projections  defined  in  (2.2)  and  (2.3) 
and  is  calculated  as 


a  =  argmax  |p.„(z)|,  (2.4) 

Z 

$  =  argmax|pa.(z)|.  (2.5) 

Z 


Cain  et  al.  [11]  also  describe  a  figure  of  merit  (FOM)  that 
evaluate  the  ability  to  differentiate  the  shift  between  two  images. 


can  be  used  to 
Cain’s  FOM  is 
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defined  such  that  z  is  the  integer-valued  index  of  a  point  on  the  cross  correlation 
between  two  projections  and  a  and  j3  are  the  actual  shifts  in  the  x  and  y  directions. 
The  FOM  is  stated 


fp„(z,a) 

FP,(z,P) 

where  the  notation  E[.]  indicates  the  expected  value  of  a  random  variable  and  z  is 
used  as  an  integer  index  for  the  cross  correlation. 

Although  this  FOM  is  accurate  for  the  algorithm  described,  if  the  projections 
are  filtered  the  noise  in  adjacent  points  of  the  cross  correlation  becomes  correlated 
and  the  assumptions  under  which  the  FOM  was  developed  no  longer  apply.  This 
points  to  a  need  to  further  generalize  the  FOM  to  account  for  filtering. 

2.2  Filtering  Images  to  Facilitate  Image  Registration 

Filtering  images  prior  to  their  registration  has  been  discussed  extensively  in 
the  literature.  It  has  also  been  used  in  the  related  problem  of  time-delay  estimation 
of  1-D  signals  [33];  however,  the  bounds  on  registration  from  filtered  images  has  not 
been  fully  explored. 

Barron  [3]  notes  that  prefiltering  to  smooth  images  is  a  common  first  stage  of 
many  image  registration  algorithms.  Filtering  to  improve  registration  performance 
is  also  described  in  [4,39,46]  and  this  improvement  has  been  attributed  to  a  variety 
of  sources.  For  example,  Bergen  et  al.  [4]  suggest  that  filtering  improves  registra¬ 
tion  performance  by  eliminating  high  frequency  image  content  that  is  most  likely 
to  include  the  effects  of  aliasing.  Filtering  may  also  correct  for  estimation  biases 
resulting  from  fixed  pattern  noise  [11],  or  from  biases  inherent  to  the  estimation 
method  used  [46,47].  Elad  et  al.  [22]  discuss  the  design  of  filters  for  gradient-based 
motion  estimation  and  show  a  way  to  design  filters  that  combine  smoothing  and 


(E[py(a)]  —  E[py(z)])2 
E[VAR[py(z)|i]  +  VAR[py(a)|i]]  ’ 
(E[py(/3)]  -  E[py(z)])2 
E[VAR[Py(z)|i]  +  VAR[py(/3)|i]]  ’ 


(2.6) 

(2.7) 
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gradient  based  estimation.  Elad  et  al.  [22]  also  provide  a  summary  of  the  filtering 
technique  proposed  by  Simoncelli  et  al.  in  [50]  which  contains  a  presmoothing  step. 
The  filters  arrived  at  in  these  papers  are  essentially  a  combination  of  low-pass  fil¬ 
ters  that  remove  extreme  high-frequency  content  and  high-pass  filters  that  remove 
low-frequency  image  biases. 

One  low-pass  filter  design  approach  that  has  been  explored  in  the  literature  is 
Wiener  filtering  two  images  before  attempting  to  register  them  [34],  This  method 
may  also  be  applied  to  the  projections  of  an  image.  Using  a  signal  Power  Spectral 
Density  (PSD)  calculated  from  the  noise-free  image  (So)  and  a  noise  PSD  calculated 
from  the  known  characteristics  of  our  noise  (Sn),  an  optimal  filtering  kernel  can  be 
calculated  for  a  projection  as  [26] 


X 


So 

So  +  Sn 


(2.8) 


The  filtering  kernel  X  is  then  multiplied  with  the  Fourier  transform  of  the 
projection.  The  inverse  Fourier  transform  of  the  result  of  this  multiplication  is  the 
filtered  projection.  An  exact  calculation  of  the  PSD  or  the  autocorrelation  relies 
on  either  a  priori  knowledge  or  estimation  from  noisy  data  using  techniques  such 
as  those  described  by  Kay  [30].  Elad  [22]  also  suggests  that  the  spectral  content 
of  a  series  of  images  should  be  considered  in  designing  a  filter  to  improve  image 
registration  performance. 

This  background  on  filtering  may  be  combined  with  a  FOM  that  is  general¬ 
ized  from  (2.7)  to  design  simple  filters  that  are  optimized  to  minimize  the  MSE  of 
registration  estimates.  The  difference  between  the  filters  proposed  and  Wiener  (or 
other)  filtering  is  that  the  proposed  filters  use  binary  coefficients  that  allow  them 
to  be  implemented  as  integer  additions  in  combinational  logic.  These  simple  and 
computationally  inexpensive  filters  can  be  used  to  achieve  effects  approaching  those 
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of  more  complicated  Wiener  filters  that  need  to  be  implemented  using  floating-point 
calculations. 


2.3  Bounds  on  the  Mean-Squared  Error  of  Estimators 

The  performance  of  an  estimator  used  for  estimating  shifts  between  images  is 
governed  by  fundamental  statistical  limits.  A  variety  of  approaches  have  been  used  to 
examine  bounds  on  the  performance  of  image  registration  algorithms;  however,  one 
of  the  most  common  approaches  is  the  Cramer- Rao  lower  bound  (CRLB).  Robinson 
and  Milanfar  derive  the  CRLB  on  the  performance  of  image  registration  algorithms 
in  [46].  They  derived  related  bounds  in  [48]  for  analyzing  the  performance  of  super¬ 
resolution  imaging.  Yetik  and  Nehorai  provide  extensive  analysis  and  derivations  of 
Cramer- Rao  lower  bounds  for  a  variety  of  geometric  distortion  models  in  [55]. 

Van  Trees  shows  that  the  CRLB  on  an  unbiased  estimate  a  of  a  single  non- 
random  parameter  of  a  vector  of  data  d  is  [53] 


VAR[a(d)]  > 


d2  lnp(d|a) 
da 2 


(2.9) 


Where  multiple  non-random  parameters  are  estimated,  a  Fisher  Information  Matrix 
(FIM),  J  can  be  derived.  Say  a  vector  of  non-random  parameters  ©  is  to  be  estimated 
from  a  vector  of  observations,  d.  Each  element  of  J  located  at  index  (i,j)  can  be 
defined  [53] 


d 2  lnp(d|@) 

_00(i)d0(j)_ 


(2.10) 


The  phase  shift  between  two  vectors  of  identical  data  received  through  two  different 
channels  is  an  example  of  a  parameter  that  is  commonly  estimated.  For  applications 
with  low  SNRs,  estimates  of  this  shift  exhibit  thresholding  behavior  [37, 41]  which 
is  not  captured  by  the  CRLB.  As  the  SNR  decreases,  there  may  be  a  point  in  the 
measured  MSE  where  shift  estimation  errors  begin  to  exceed  those  predicted  by 
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the  CRLB  even  though  the  estimator  is  capable  of  reaching  the  CRLB  at  higher 
SNRs.  This  thresholding  occurs  because,  as  the  noise  in  an  image  increases,  it 
becomes  increasingly  likely  that  registration  errors  will  occur  at  the  subpeaks  of 
the  autocorrelation  of  an  image.  This  behavior  is  observed  but  not  quantified  in 
Robinson  and  Milanfar’s  work  on  performance  bounds  [46]. 

This  deficiency  in  quantifying  this  thresholding  behavior  can  be  resolved  by 
examining  another  bound  used  traditionally  in  high-noise,  low-signal-strength  envi¬ 
ronments  for  estimating  delays  between  signals.  The  Barankin  bound  has  been  used 
for  time-delay  estimation  of  one-dimensional  signals  [37,38,41],  In  the  literature, 
thesholding  behavior  is  predicted  and  estimated  by  the  Barankin  bound  in  the  one¬ 
dimensional  problems  of  radar  and  sonar  returns  in  the  literature  [37,38,41],  If  the 
true  values  of  a  vector  to  be  estimated  are  represented  by  the  vector  ©0,  a  shifted 
version  of  the  vector  likely  to  produce  an  error  can  be  represented  as  the  vector  ©n. 
Considering  the  most  likely  values  of  ©n,  the  Barankin  bound  of  an  unbiased  estima¬ 
tor  an  estimate  ©  of  a  vector  of  true  parameters  ©o  can  be  written  and  calculated 
as  [37] 


u2  >  J  x  +  ($-  J-1A)(A”1)($  -  J^XA)T,  (2.11) 


where  A  =  B  —  A1  J  1A,  J  is  the  FIM  calculated  as  in  the  CRLB,  and  where 


A  ij  —  E 


Bty 


<91np(d|©c 


$  = 


aeo(i)  -L(d|e’'e») 

E  [L(d|©i,  0o)L(d|@j,  @0)] , 

[@1,  ©2,  •••!  ©n]j 


(2.12) 

(2.13) 

(2.14) 


and  L(d\&i,  ©o)  is  the  likelihood  function 


L(d|0i,0o) 


p(d|©Q 

P(d|©0)' 


(2.15) 
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The  notation  ©j  is  used  to  indicate  values  of  ©  other  than  its  true  value  ©o  and 
0j(i)  to  indicate  a  scalar  element  of  ©j  indexed  at  i. 

Because  time-delay  estimation  is  mathematically  similar  to  spatial  displace¬ 
ment  estimation  in  images,  this  bound  may  be  extended  to  two-dimensional  spatial 
estimation  to  better  predict  the  behavior  of  the  effects  of  increasing  noise  on  the 
performance  of  correlation-based  image  registration.  This  bound  may  also  help  to 
explain  difficulties  in  estimating  shifts  when  images  are  blurred  as  is  the  case  with 
inadvertent  focal-length  errors. 


2-4  Calculating  the  Optical  Transfer  Function  of  an  Imaging  System 

Focal-length  errors  can  be  modeled  using  Fourier  optics.  In  his  book  on  this 
subject,  Goodman  describes  the  effects  of  defocus  errors  on  an  incoherent  optical 
system  [24]  by  describing  an  Optical  Transfer  Function  (OTF)  as  a  function  of  a 
generalized  pupil  function.  This  approach  requires  the  use  of  the  Amplitude  Transfer 
Function  (ATF),  the  wavelength  (A)  of  light  being  detected,  the  size  and  shape  of 
the  pupil,  the  distance  between  the  optical  system  and  the  focal  plane  as  input 
parameters  (zf),  and  the  distance  between  the  optical  system  and  the  plane  of  the 
detector  (za,  assuming  Z{  ^  za  as  in  [24]).  Using  (x,y)  to  indicate  coordinates  in  the 
aperture  plane  where  (0,  0)  is  located  at  the  center  of  a  circular  aperture  of  width 
21,  a  pupil  can  be  expressed  as  [24] 


P  (x,y) 


1  :  \j x1  +  y2  <  l 

0  :  else 


(2.16) 


Goodman  uses  this  pupil  function  to  describe  the  frequency-domain  ATF  of  a  system 
with  a  focal  length  abberation  using  frequency  coordinates  (/x,  fy)  as 


S  (/*,/„) 


P(Xzifx,\zify)exp 


j2vr 

A 


W(Xzifx,Xzify )  , 


(2.17) 
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where 


W(x,y) 


-  (—  - 

2  \  za  ZiJ 


(x2  +  y2). 


(2.18) 


Finally,  the  OTF  'K(fXl  fy)  can  be  calculated  by  first  calculating  the  2-D  autocorre¬ 
lation  of  the  ATF  and  then  normalizing  so  that  the  peak  of  the  OTF  equals  unity. 
The  inverse  Fourier  transform  of  J~C  will  be  designated  H0.  This  optical  filtering  is 
performed  before  readout  noise  is  added  to  the  image  which  affects  the  formation  of 
the  PDF  of  the  image  in  that  additive  noise  is  uncorrelated  in  the  filtered  pixels  of 
the  optically  filtered  image.  To  derive  CRLBs,  analysis  is  limited  to  those  values  of 
J~C  which  can  be  reasonably  approximated  by  non-singular  H0  (i.e.  sub-wavelength 
defocus  errors). 


2.5  Existing  Single-Frame  Denoising  Algorithms 

In  order  to  reduce  the  computational  complexity  or  output  performance  of 
modern  image  denoising  algorithms,  it  is  important  that  one  has  an  understanding 
of  how  these  algorithms  work.  To  this  end,  this  section  provides  an  overview  of 
related  image  denoising  algorithms  that  selectively  average  pixels  within  a  single 
image.  Image  processing  literature  is  rife  with  methods  for  denoising  images  and 
this  review  will  not  attempt  to  address  all  of  these  methods.  Good  overviews  of 
traditional  single-frame  image  denoising  algorithms  may  be  found  in  [9]  and  in  [26]; 
however,  an  overview  of  the  methods  that  are  most  similar  to  the  algorithms  proposed 
in  this  research  is  provided.  These  similar  methods  are  those  that  reduce  noise  while 
attempting  to  maintain  high  spatial  frequency  image  content. 

The  inadvertent  reduction  of  high-frequency  content  is  the  Achilles’  heel  of 
many  basic  image  processing  algorithms.  Some  of  the  simplest  image  smoothing 
algorithms  are  linear,  shift-invariant  filters  such  as  the  Gaussian  filter,  the  Wiener 
filter  [26],  and  the  mean-shift  algorithm  which  replaces  pixels  in  an  image  with 
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the  mean  of  the  pixels  in  a  surrounding  neighborhood  [16,23].  These  algorithms 
reduce  the  noise  power  in  an  image  through  averaging;  however,  because  they  average 
without  respect  to  local  content,  they  also  tend  to  blur  high-frequency  image  content. 
This  deficiency  necessitates  the  study  of  more  complex  denoising  algorithms  that  will 
maintain  this  high  frequency  content  while  removing  additive  noise  from  the  image. 

In  an  effort  to  remove  noise  while  preserving  high-frequency  content,  research 
has  turned  to  nonlinear  filters.  One  of  the  simplest  and  most  commonly  employed 
nonlinear  methods  for  reducing  noise  power  is  the  median  filter  which  replaces  indi¬ 
vidual  pixels  in  an  image  with  the  median  value  of  pixels  in  a  defined  neighborhood 
surrounding  these  pixels  [26] .  This  filter  generally  provides  better  results  than  Gaus¬ 
sian  smoothing;  however,  better  performance  may  be  obtained  using  more  advanced 
smoothing  techniques. 

Neighborhood  filtering  is  one  nonlinear  approach  to  image  denoising  that  has 
shown  recent  promise  in  the  literature  and  has  produced  results  that  improve  upon 
those  of  many  other  image  denoising  algorithms.  In  a  neighborhood  filter,  the  algo¬ 
rithm  assigns  an  output  value  to  a  pixel  based  on  an  evaluation  of  the  relationships 
between  pixels  in  a  surrounding  neighborhood.  Many,  if  not  most,  neighborhood 
filters  are  fundamentally  related  and  the  chapter  begins  by  highlighting  these  re¬ 
lationships  as  it  reviews  these  algorithms.  The  algorithms  reviewed  include  total 
variation  minimization  [49],  anisotropic  diffusion  [40],  bilateral  filtering  [52],  the 
nonlocal  means  [9],  and  an  optimal  patch-based  algorithm  [31].  These  algorithms  all 
include  some  measure  of  pixel  similarity  that  allows  the  algorithm  to  differentiate 
between  similar  and  dissimilar  pixels  within  the  same  image.  All  of  the  algorithms 
discussed  do  this  in  ways  that  are  different  but  are  ultimately  related. 

2.5.1  Total  Variation  Minimization.  The  Total  Variation  (TV)  minimiza¬ 
tion  algorithm  is  an  iterative  image  smoothing  algorithm  that  preserves  edges  [13,49]. 
Unlike  the  other  algorithms  that  will  be  discussed  in  this  chapter,  TV  minimiza- 
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tion  identifies  similar  pixels  in  the  local  neighborhood  of  a  given  pixel  by  examining 
the  LI  norm  of  the  neighborhoods  surrounding  these  pixels.  This  algorithm  is  de¬ 
scribed  for  the  continuous  case  by  Chambollc  [12]  and  by  Rudin  et  al.  [49]  among 
others;  however,  because  digital  images  are  sampled  discretely,  the  most  relevant 
formulation  is  provided  by  Chan  et  al.  [13].  For  explanatory  purposes,  say  there  is 
a  pixel  D (x,y)  in  a  larger  image  D  of  size  S  xT  where  S  G  N  and  T  G  N  that  is 
indexed  using  x  G  S  =  {1, S'},  y  G  7  =  (1, T}  and  say  the  set  of  the  indices  of 
the  closest  neighbors  of  D (x,y)  is  defined  as  A.  Then,  the  local  variation  at  a  pixel 
D  (x,  y )  is  defined  as  [13] 

(2.19) 

To  avoid  discontinuities  in  later  calculations,  the  local  variation  in  (2.19)  is  modified 
to  be  the  regularized  local  variation 

I  VX)J/D|a  =  \J\W  XtyD\2  +  a2,  (2.20) 

where  a  is  small  constant  normally  chosen  on  the  order  of  10-4.  The  output  of  the 
filtering  is  then 


I  (x,y)=  h°‘p(D(i’j))~D(i,j)  +  haa(D(i,j))D(x,y) 

(i,j)€A 

where  between  two  pixels  a  indexed  at  D (i,j)  and  (3  at  D (x,y) 


hap{D(i,j)) 

j)) 

wap(D  (i,j)) 


Wgp(D(hj)) 

^  ^—j(x.y)dA^a  /3(D(2;,?/)), 

A 

A  +  E(^J)eylW«/3(D(x,^/)), 
1  |  1 
|VXjJ/D|a  +  |VjjD|a’ 


(2.21) 
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and  where  A  is  a  Lagrange  multiplier  chosen  for  a  known  additive  noise  of  variance 
a2  as 


A  =  ^Jf  5Z  ^2  w<*p(D(i,j)  ~~D(x,y))(D(x,y) -~D(x,y)°).  (2.22) 

xe$  ,yeT  (ij)e.A 

Since  this  is  an  iterative  algorithm,  the  notation  D°  is  used  to  indicate  the  original 
value  of  the  image.  It  is  interesting  to  note,  as  do  Chan  et  al.  [13],  that  this  algorithm 
gives  large  weights  to  those  pixels  in  areas  with  low  variation  and  large  weights  to 
those  with  low  variation.  The  effect  of  this  is  that  area  smoothing  occurs  with  less 
degradation  of  true  image  edges  than  would  be  found  with  a  linear  filter.  This  basic 
mechanism  is  also  present  in  anisotropic  diffusion  which  is  examined  in  the  next 
section. 

2.5.2  Anisotropic  Diffusion.  Another  method  of  image  smoothing  that 
attempts  to  minimize  degradation  of  image  edges  is  anisotropic  diffusion  [40].  This 
method  attempts  to  iteratively  smooth  an  image  by  treating  the  smoothed  image 
as  solutions  to  the  heat  equation  over  arbitrary  time  steps.  Perona  and  Malik  [40] 
create  an  anisotropic  diffusion  algorithm  that  accounts  for  edges  and  preserves  edges 
in  the  smoothed  image  by  assuming  them  to  be  differences  in  conductivity  for  heat 
flow.  For  an  iteration  of  the  algorithm  on  a  single  pixel  in  an  image  D, 

I (x,y,t)  =  div(c(x,  y,  f)Vi(x,  y,t—  1)),  (2.23) 

where  c(x,  y,  t )  indicates  the  conductivity  between  pixels,  t  is  an  arbitrary  time  step, 
l(x,y,  0))  =  D (x,y),  div  indicates  the  divergence  of  a  gradient,  and  V  is  a  function 
that  produces  the  image  gradient.  The  algorithm  preserves  edges  in  the  image  by 
modeling  conductivity  between  pixels  as  functions  of  the  image  gradient  so  that 
the  conductivity  between  a  pixel  at  a  point  (x,  y )  and  its  neighbors  at  time  t  is 
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c(x,y,t )  =  <?(VI)  where  g(VI)  is  proposed  variously  by  Perona  and  Malik  [40]  as 


9(VI) 

9(VI) 


1  + 


2  ’ 


and  by  Black  et  al.  [6]  as 


9(VI) 


i(l-||Vi||2)2  :  |Vl|  <  a 

0  :  else 


(2.24) 


where  a  is  a  scaling  constant  in  all  of  the  above  equations.  As  was  the  case  with 
TV  smoothing,  the  effect  of  this  algorithm  is  to  facilitate  smoothing  between  pixels 
in  regions  with  similar  intensities,  and  to  inhibit  smoothing  between  pixels  in  dis¬ 
similar  regions.  In  this  case,  the  algorithm  assigns  low  conductivity  values  to  pixel 
values  across  large  gradients,  and  high  conductivity  values  to  pixels  across  small 
image  gradients.  This  again  helps  to  preserve  the  edges  in  the  image  by  determining 
similar  and  dissimilar  pixels  adjacent  to  a  single  pixel  under  consideration.  The  next 
algorithm  attempts  to  perform  nonlinear  smoothing  using  a  non-iterative  method. 


2.5.3  Bilateral  Filtering.  Non-iterative  algorithms  are  generally  preferable 
to  iterative  algorithms  because  they  can  be  performed  with  improved  processing  time. 
Bilateral  filtering  [21,52]  is  a  non-iterative  smoothing  method  that  works  by  applying 
a  filtering  kernel  to  an  image  that  is  designed  to  avoid  smoothing  across  edges  within 
the  image,  ft  does  this  using  a  filtering  kernel  that  averages  pixels  according  to  two 
measures  of  distance:  spatial  distance  and  radiometric  distance.  The  part  of  the 
kernel  that  examines  spatial  distance  works  like  a  traditional  Gaussian  filter  and 
includes  pixels  in  the  average  that  are  spatially  close  to  a  given  pixel.  The  part  of 
the  kernel  that  examines  radiometric  distance  includes  pixels  in  the  average  that  are 
similar  in  intensity.  As  a  discrete  example,  say  that  there  exists  an  image  D  of  size 
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S  x  T  where  S  G  N  and  T  G  N  that  is  indexed  using  i  e  §  =  {1, S},  j  6  T  = 
{1, ...,  T}.  Then,  it  is  possible  to  find  a  smoothed  estimate  of  a  pixel  as 

i(bi)  =  (2-25) 

«es  ^ea 

In  this  equation,  fs((i,j),(u,v ))  is  a  function  that  provides  a  measurement  of  the 
spatial  distance  between  pixels,  /r((D(i,  j),  D(w,  v))  is  a  function  that  measures  the 
radiometric  distance  between  pixels,  and  Z(i,j )  is  a  normalizing  factor.  A  Gaussian 
example  provided  by  Tomasi  and  Manduchi  [52]  gives  these  as 

fs((i,j),(u,v))  =  exp  ((*  -  uf  +  {j  -  v)2)^j  , 

fr(D(i,j),D(u,v))  =  exp  ((D(i,j) -D(u,v))2)^J  , 

Z(i,j)  =  ^^fs((i,j),(u,v))fr(D(i,j),D(u,v)),  (2.26) 

ueS  vei 

where  as  and  are  empirically-determined  constants  that  spread  the  kernels  to 
achieve  the  desired  level  of  filtering. 

The  effect  of  combining  these  distance  measures  is  that  pixels  that  are  spatially 
close  but  not  close  in  intensity  (as  on  an  edge)  are  not  included  in  the  average. 
Interestingly,  Elad  [21]  showed  that  the  bilateral  filter  is  actually  mathematically 
equivalent  to  anisotropic  diffusion,  a  result  that  is  suggested  by  the  work  of  Barash 
[2] .  This  result  provides  a  mathematical  connection  to  the  next  smoothing  algorithm 
described  in  this  chapter. 

2.5.4  Nonlocal  Means.  Another  recent  approach  to  image  denoising  is  the 
nonlocal  means  (NLM)  algorithm  proposed  by  Buades  et  al.  [8,9].  This  algorithm 
compares  and  averages  pixels  chosen  by  evaluating  their  similarity  based  on  the 
similarity  of  their  local  neighborhoods.  Based  on  this  measure  of  similarity,  this 
algorithm  assigns  weights  to  other  pixels  of  the  same  image  and  then  uses  a  weighted 


22 


sum  approach  to  arrive  at  a  denoised  version  of  the  pixel  of  interest.  Kervrann  and 
Boulanger  [32]  provide  the  link  to  some  of  the  other  algorithms  in  this  chapter  when 
they  note  that  when  the  neighborhood  size  used  in  the  NLM  to  compare  pixels  is 
equal  to  1,  the  NLM  algorithm  reduces  to  the  bilateral  filter.  The  unique  aspect  of 
this  filter,  however,  is  that  it  measures  similarity  between  pixels  by  comparing  the 
neighborhoods  around  the  pixels,  rather  than  examining  the  pixel  values  themselves. 

By  way  of  explanation,  suppose  there  again  exists  an  image  D  of  size  S  x  T 
where  S  G  N  and  T  e  N.  For  a  given  pixel  located  at  the  neighborhood  around 
the  pixel  can  be  represented  as  an  N  x  iV-sized  subimage  F ij  C  D  where  Fjj  is 
centered  at  i  G  §  =  {1, ...,  S},  j  G  T  =  {1, T}  and  where  {N  =  2n  +  1  n  e  N}. 
If  a  zero  pad  is  added  to  image  D  by  n  in  all  directions  then,  for  all  s  €  §  and  t  E  T, 
there  are  N'2  —  1  other  subimages  in  D  which  may  be  similar  in  an  L2-norm  sense  to 
F.  These  subimages  are  denoted  as  G sj-  The  NLM  algorithm  then  assigns  weights 
A  (s,  f)  e  M  to  each  GS)t  and  constructs  a  denoised  version  of  the  center  pixel  of  F,j 
as 

NLM(D(i,j))=  J]  A(s,  t)GS:t(n  +  1,  n  +  1),  (2.27) 

seS,teT 

where  the  center  pixel  of  GS;t  is  found  at  the  block  coordinates  (n  +  l,  n  +  1).  For 
the  case  where  s  ^  i  and  t  ^  j,  the  weights  A (s,t)  are  assigned  by  the  following 
exponential  function 

(228) 

where 

z(b  j)  =  exp  HFiJ  _  •  (2-29) 

seS  ,teT  '  ' 
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In  this  equation,  as  in  aniostropic  diffusion,  h  is  an  experimentally  determined  pa¬ 
rameter  that  controls  the  roll  off  of  the  exponential  function.  For  the  case  where 
s  =  i  and  t  =  j,  the  weight  A (s,t)  =  max{A(s,  t)\s  ^  i,t  ^  j}  is  used. 

Other  than  the  aforementioned  difference  in  similarity  measurement,  the  NLM 
algorithm  is  also  interesting  because  it  appears  in  [9]  to  provide  performance  that 
is  superior  to  the  algorithms  discussed  thus  far.  However,  it  also  has  several  facets 
which  make  it  suboptimal: 

•  No  analytic  method  is  presented  for  determining  the  parameter  h  used  to 
control  the  decay  of  the  exponential. 

•  Features  in  natural  images  are  frequently  similar  but  may  be  different  in  il¬ 
lumination  or  reflectivity.  The  NLM  algorithm’s  reliance  on  the  L2  norm  of 
the  error  between  F  and  G  does  not  exploit  these  highly  similar  regions  which 
could  be  separated  only  by  constant  bias  or  gain. 

•  The  algorithm  produces  results  that  contain  false  contouring  similar  to  that 
seen  when  image  pixels  are  sub-optimally  quantized. 

These  shortcomings  are  addressed  and  ameliorated  in  the  next  algorithm. 

2.5.5  Patch- Based  Denoising  with  Optimal  Spatial  Adaptation.  Kervrann 
and  Boulanger  [31,32]  describe  the  Optimal  Spatial  Adaptation  (OSA)  algorithm 
that  overcomes  some  of  the  shortcomings  of  the  NLM  algorithm.  They  describe  a 
block-based  algorithm  that  thresholds  the  Euclidian  distance  between  blocks  and 
builds  an  exponentially  weighted  average  of  similar  blocks.  Their  algorithm  elimi¬ 
nates  the  arbitrary  parameter  in  Buade’s  algorithm  by  employing  an  adaptively-sized 
window  around  a  given  pixel  to  find  an  optimal  neighborhood  size.  Like  the  NLM 
algorithm,  their  method  uses  an  exponential  weighting  of  candidate  blocks;  however, 
it  improves  on  Buade’s  method  by  eliminating  the  arbitrary  parameter  from  their 
algorithm  and  reducing  smoothing  artifacts.  Kervann  and  Boulanger’s  papers  also 
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describe  a  method  for  precisely  estimating  the  error  function  of  similar  regions  based 
on  the  chi-square  distribution. 

2.5.6  Other  Patch- Based  Methods.  Other  methods  have  been  proposed 
that  employ  matching  and  combine  blocks  using  different  measures  of  similarity. 
Dabov  et  al.  describe  a  method  for  block  matching  and  filtering  in  the  Fourier  domain 
[17].  Their  method  selects  similar  blocks  by  hard-thresholding  the  2-D  transforms  of 
two  blocks  and  then  examining  the  L2  distance  between  the  thesholded  transformed 
blocks.  They  then  create  a  three-dimensional  array  of  the  matching  blocks,  and 
perform  a  3-D  transform,  thresholding,  and  inverse  transform  operation  to  arrive  at 
denoised  estimates  of  the  matched  blocks. 

The  Unsupervised  Information-Theoretic  Adaptive  Filter  (UINTA)  proposed 
by  Awate  and  Whitaker  [1]  also  combines  information  from  blocks  within  an  image  in 
a  way  that  is  unique.  Their  algorithm  selects  a  random  sample  of  blocks  in  an  image 
and,  using  the  L2  distance  between  the  sampled  blocks  and  a  block  of  interest,  creates 
an  entropy  estimate  for  the  block  of  interest.  It  then  uses  an  iterative  gradient- 
descent  algorithm  to  reduce  the  entropy  of  the  center  pixel  in  the  block  of  interest. 
The  algorithm  iterates  until  the  variance  of  the  residual  error  between  the  denoised 
block  and  the  observed  block  equals  the  known  variance  of  the  noise.  Although 
this  algorithm  does  produce  good  results,  it  is  computationally  intensive  and  has  a 
tendency  to  over  smooth  image  content.  It  is  noted  to  work  best  on  images  that  are 
highly  periodic  in  content  [9]. 

2.6  Chapter  Summary 

This  chapter  has  provided  a  review  of  current  literature  on  image  registration 
and  denoising  that  yields  a  number  of  opportunities  for  research.  The  review  dis¬ 
cussed  potential  improvements  to  existing  registration  methods  that  may  be  used  to 
improve  image  registration  performance.  It  also  showed  an  opportunity  to  employ 
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the  Barankin  bound  to  extend  current  understanding  of  image  registration  algorithm 
performance.  It  introduced  methods  employing  Fourier  optics  that  will  be  used  to 
model  focal-length  errors  in  optical  systems.  Lastly,  it  provided  a  review  of  related 
nonlinear  image  Liters  that  focused  on  the  measures  of  similarity  they  use  to  identify 
similar  pixels  and  regions  within  images. 
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III.  Improving  Projection- Based  Image  Registration 


This  chapter  describes  a  method  for  accomplishing  fast,  reliable  image  registra¬ 
tion  that  may  be  implemented  with  a  low  computational  cost.  The  method 
described  extends  work  on  projection-based  image  registration  performed  using  cross 
correlations  described  first  by  Cain  et  al.  [11]  and  reviewed  in  Section  2.1.  The  goal 
of  this  chapter  is  to  provide  a  method  for  designing  a  low-pass  filter  that  minimizes 
the  mean-squared  error  of  registration  estimates. 

The  filters  are  designed  to  exploit  local  correlations  within  images.  Most 
naturally-occurring  images  captured  by  imaging  systems  exhibit  some  local  spatial 
correlation  [11]  and  a  variety  of  models  can  be  used  to  approximate  correlation  in 
images  [26].  This  spatial  correlation,  however,  is  not  guaranteed  and  is  not  neces¬ 
sarily  locally  consistent  within  an  image.  In  particular,  spatial  correlation  may  be 
absent  in  natural  images  of  free  space  and  in  images  that  are  under  sampled.  This 
chapter  demonstrates  that  if  significant  spatial  correlation  is  present  in  the  image 
projections,  the  performance  of  projection-based  image  registration  algorithms  can 
be  improved  significantly  by  the  application  of  simple  low-pass  filters. 

The  chapter  assumes  that  the  images  to  be  registered  are  wide-sense  stationary 
(WSS)  and  ergodic  which  allows  the  removal  of  spatial  indices  from  second  order 
statistics  in  the  evaluation  of  the  expected  value  of  an  image  using  a  spatial  average. 
In  practice,  many  images  have  unevenly  distributed  intensities  that  do  not  meet 
the  WSS  requirement  for  a  constant  mean.  Section  4.3  discusses  bias  reduction 
techniques  which  allow  the  use  of  a  WSS  assumption  even  when  the  intensities  are 
not  evenly  distributed. 

If  speed  is  a  concern,  a  simple  bias-reduction  filter  can  be  can  be  implemented 
optically  using  a  two-lens  system.  If  one  lens  is  defocused  so  it  acts  as  a  low- 
pass  filter,  the  difference  between  the  two  images  (or  two  image  projections)  is  the 
high-frequency  content  of  the  image.  This  arrangement  is  simulated  in  experiments 
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with  aerial  imagery  where  it  is  apparent  that  this  is  essentially  an  optical  edge 
detection  system.  This  chapter  does  not  attempt  a  detailed  description  of  the  design 
of  the  the  bias  reduction  portion  of  such  a  filtering  arrangement  except  to  note 
that  an  optical  filter  combined  with  the  type  of  filter  introduced  here  is  much  less 
computationally  complex  than  any  of  the  methods  described  in  the  literature  thus 
far. 

It  is  demonstrated  in  this  chapter  that  these  simple  and  computationally  inex¬ 
pensive  filters  can  be  used  to  achieve  effects  approaching  those  of  more  complicated 
Wiener  filters  that  need  to  be  implemented  using  floating-point  calculations. 

3.1  Improved  Projection-Based  Algorithm 

When  the  environment  is  expected  to  yield  images  with  significant  local  spatial 
correlation,  and  an  imaging  system  is  designed  to  provide  images  that  are  sampled 
sufficiently  to  detect  this  correlation,  these  facts  can  be  exploited  to  design  filters 
that  can  improve  the  performance  of  a  projection-based  shift  estimator. 

The  low-pass  filters  designed  in  this  chapter  are  convolutional  kernels  that  are 
applied  to  the  projections  of  two  images  prior  to  calculating  their  cross-correlation. 
Assuming  that  the  spatial  correlation  of  an  image  is  the  same  in  both  projections,  a 
filtering  kernel  of  length  w  can  be  determined  for  both  dimensions  as 

W—  1 

hm(A)  =  YX**-  n )•  (3.1) 

n= 0 

For  each  projection  dn^x  and  dn>2/,  the  filtered  projections  are  then  calculated 


f 

ln,x 

(3.2) 

f 

ln,y 

^-n,y  *  h-iLM 

(3.3) 
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For  imagery  exhibiting  spatial  correlation,  the  effect  of  the  aforementioned  filtering 
is  to  increase  the  effective  SNR  of  the  individual  projections.  In  order  to  mitigate 
the  effects  of  new  information  entering  the  scene,  one  of  the  projections  is  windowed 
using  the  method  described  in  [11],  Using  the  notation  iuy  to  indicate  a  vector 
with  all  elements  equal  to  the  mean  of  the  filtered  data  within  the  windowed  region, 
these  modified  projections  are  then  used  to  compute  the  1-D  cross  correlations  of 
the  projections  of  two  images  as 

p,  =  (f1,,-f1J*Wf(f2,?/-  W2>,),  (3.4) 

p:c  =  (fi,x  *  Wf(f2iI  -  Wff2)a.).  (3.5) 

In  terms  of  the  original  data,  these  cross  correlations  can  be  written  as 

P y  =  (dii?/  *  hw  —  wdlty)  *  Wf(d2)2/  *  h„,  —  wd2,y),  (3.6) 

P:c  =  (di,x  *  hw  -  wdi)X)  *  Wf(d2)X  *  hw  -  wd2,x).  (3.7) 

As  in  the  unfiltered  case,  the  shift  estimates  are  computed  from  these  projections  as 


a 

=  argmax  py(2)  , 

Z 

(3.8) 

P 

=  arg max  |px(z)|. 

(3.9) 

The  length  of  the  filter  w  is  chosen  by  using  a  FOM.  In  the  next  section  the  FOM 
introduced  in  [11]  is  modified  so  that  it  can  be  used  as  a  tool  to  design  a  low  pass 
filter  that  minimizes  the  MSE  of  registration  errors. 

3.1.1  Introduction  of  the  Revised  Figure  of  Merit.  Although  the  FOM  in¬ 
troduced  in  [11]  and  reviewed  in  Section  2.1  was  shown  to  be  effective  in  evaluating 
the  performance  of  both  2-D  cross  correlation  and  projection-based  registration  al¬ 
gorithms,  it  is  necessary  to  modify  it  for  use  with  the  new  algorithm.  The  problem 
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lies  in  measuring  the  variance  of  the  terms  in  the  denominators  of  (2.6)  and  (2.7).  In 
the  new  algorithm,  applying  a  convolutional  kernel  of  any  size  other  than  one  to  the 
projections  leads  to  a  1-D  cross  correlation  function  with  spatially  correlated  noise. 
Since  the  noise  is  no  longer  independent  and  identically  distributed  (i.i.d.)  in  the 
points  on  the  cross-correlation,  the  denominator  in  Cains  FOM  no  longer  accurately 
reflects  the  vertical  distance  between  two  points  on  the  cross  correlation.  Therefore 
the  modified  figures  of  merit  are  used  where  again,  the  notation  E[.]  indicates  the 
expected  value  of  a  random  variable  and  z  is  used  as  an  integer  index  for  the  cross 
correlation: 


FPy(z,a)  =  < 

f  (E[Py  (o)]  — E[py  (z)])2 

I  E[VAR[py(z)— py(a)|i]] 

i 

:  Z  fz  a 

(3.10) 

1  0 

:  else 

Fp.(z,P)  =  < 

f  (E[py(/3)]-E[Py(z)])2 

1  E[VAR[py(z)-py(/3)|i]] 

:  z  ±  (3 

(3.11) 

l  0 

:  else 

It  is  important  to  recognize  that  in  changing  the  FOM,  a  discontinuity  has  been 
introduced  at  the  point  z  =  a  which  is  accounted  for  by  assigning  a  value  of  zero  at 
the  point  of  discontinuity. 

3.1.2  Use  of  the  FOM  in  Filter  Design.  The  correlation-based  image 
registration  involves  searching  an  area  of  interest  within  a  frame  for  content  that  is 
identical  to  a  previous  frame.  In  the  region  that  this  content  is  identical,  the  cross 
correlation  approximates  the  autocorrelation  of  the  region  of  interest.  Thus,  for  any 
given  projection,  the  point  of  primary  interest  is  the  apex  of  the  autocorrelation 
which  occurs  at  a  —  0  and  indicates  the  most  likely  estimate  of  the  point  where  the 
two  frames  are  aligned.  Using  the  notation  (  ,  )  to  denote  the  inner  product  of  two 
vectors,  if  the  true  shift  between  the  images  is  zero,  then  for  any  shift  value  a  an 
error  occurs  when  (d^,,,  d2,j/)  <  (dii2/,  TQd2y).  For  most  common  correlation  models, 
for  a  /  0  the  L2  distance  between  d1)3/  and  TQ,d2  y  is  be  smallest  in  expectation  at 
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a  =  —  1  and  a  =  +1.  Therefore,  for  an  unbiased  estimator,  and  a  symmetric 
autocorrelation  model,  autocorrelation  points  at  a  =  +1  and  a  =  —1  are  the  points 
that  are  most  likely  to  cause  a  registration  error  (i.e.  these  are  normally  the  points 
closest  in  magnitude  to  the  peak  of  the  autocorrelation.)  This  approach  is  similar 
to  the  process  of  finding  test  points  for  calculation  of  the  Barankin  bound  on  a 
registration  estimate  [37]  and  furthermore,  the  choice  of  a  =  +1  and  a  =  —1  does 
imply  some  amount  of  a  priori  knowledge  of  the  structure  of  the  signal  PSD.  However, 
this  correlation  structure  is  common  to  most  natural  images  except  those  with  flat 
or  impulse-shaped  autocorrelations. 

The  consequence  of  this  conclusion  is  that  the  design  of  a  filter  can  be  ap¬ 
proached  by  attempting  to  minimize  the  most  probable  registration  errors.  This  is 
done  by  deriving  a  FOM  for  the  filtered  projections  that  measures  the  effective  SNR 
between  shifts  of  a  =  0  and  a  =  —1.  Using  the  assumption  that  noise  is  i.i.d.  in 
each  pixel,  it  can  be  shown  that  if  the  projections  are  filtered  with  a  kernel  of  size 
w  and  a  windowing  function  of  size  L,  the  FOM  for  the  filtered  projections  of  an 
N  x  N  image  is 

(LN2  (VAR(I)  -  COVz(I)))2 
w2a2LN2( 2(VAR(I)  -  COVz(I))  +  a2)' 

The  derivation  for  3.12  is  included  as  an  appendix  in  Sec.  A.l.  With  an  analytic 
expression  for  Fp  given  by  (3.12),  an  exhaustive  search  of  values  can  be  performed 
over  a  reasonable  range  (e.g.  [1,  20])  to  find  a  value  for  w  that  maximizes  the  FOM. 
By  maximizing  the  FOM  with  this  value,  the  most  probable  registration  errors  can 
be  minimized,  thereby  minimizing  the  MSE  of  the  registration  errors  overall.  The 
optimal  filter  size  can  then  be  written  as 

Wy,oPt  =  arg  max  Fp  (0,  -1) .  (3. 13) 

W 


(3.12) 


FpM  1) 
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3.2  Observing  Covariance  in  Images 


In  order  to  employ  (3.12),  it  is  necessary  to  have  a  mechanism  for  estimating  the 
average  covariance,  COVz(I|a)  described  in  Appendix  B.  This  section  describes  how 
this  value  can  be  estimated  in  an  image  corrupted  by  noise.  The  effects  of  biases  on 
these  measurements  are  also  discussed  along  with  suggested  methods  for  minimizing 
the  effects  of  these  biases.  The  covariance  measured  and  graphed  in  Figure  3.2  was 


Figure  3.1:  1024  x  1024  grayscale 

image  of  the  Pentagon  from 
http://sipi.usc.edu/database/. 


Figure  3.2:  Measured  covariance  of 

the  column  projections  of  the  Pentagon 
image  in  Figure  3.1. 


calculated  by  taking  the  inner  product  of  a  projection  and  a  circularly  shifted  version 
of  the  projection  for  the  image  shown  in  Figure  3.1  in  the  absence  of  noise.  If  the 
covariance  model  is  known  a  priori ,  this  data  in  can  be  used  in  calculations;  however, 
for  many  applications,  this  information  is  not  available.  When  noise  is  uncorrelated, 
the  available  data  can  be  used  to  estimate  a  working  covariance  model  using  only 
very  basic  calculations. 

When  uncorrelated  noise  is  added  to  the  image,  the  magnitude  of  the  center  of 
the  covariance  plot  (corresponding  to  the  variance  of  the  noise)  increases.  However, 
the  effect  on  the  off-center  values  of  the  covariance  function  is  much  less  pronounced. 
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This  effect  is  shown  pictorially  in  Figs.  3.3  and  3.4  where  the  covariance  models  have 
been  calculated  by  taking  the  inner  product  of  a  projection  within  the  windowed  area 
with  a  circularly  shifted  version  of  itself.  Although  a  variety  of  techniques  could  be 
employed  to  estimate  the  optimal  covariance  model  from  the  noisy  covariance  model, 
a  crude  but  usually  effective  method  is  shown  here.  The  vector  of  values  representing 
the  covariance  model  is  called  C,  and  the  index  of  the  the  midpoint  of  C  is  0  then 
replace  C(0)  with 


C(0)  =  C(— 1)  + 


|C(— 1)  —  C(— 2)| 
2 


(3.14) 


The  factor  of  one  half  in  (3.14)  gives  a  very  conservative  estimate  for  the  location 
of  the  peak  which  becomes  important  in  the  design  a  filter  that  minimizes  the  MSE 
of  the  registration  estimate.  At  this  point,  a  variety  of  filtering  techniques  could 
be  employed  to  further  smooth  the  covariance  estimate.  For  example,  when  the 
observed  scene  is  expected  to  change  little  between  frames,  a  covariance  model  could 
be  developed  as  an  average  of  covariance  models  over  many  frames  of  data.  However, 
for  simplicity,  results  are  shown  using  a  covariance  model  derived  from  a  single  frame 
of  data.  With  this  covariance  model,  a  filter  can  be  designed  for  a  given  image  that 
minimizes  registration  errors  using  (3.13). 


3.3  Experimental  Results 

The  algorithm  was  tested  using  a  variety  of  standard  test  images  and  also 
with  a  series  of  images  from  an  aerial  imaging  platform.  Using  standard  images,  the 
operation  of  the  algorithm  was  verified  with  images  that  were  heavily  corrupted  with 
AWGN.  Using  a  sequence  of  frames  with  more  typical  noise  values,  an  examination 
of  how  the  algorithm  would  work  under  more  common  conditions  such  as  those 
frequently  encountered  with  infrared  imaging  systems  was  performed. 
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Figure  3.3:  512  x  512 

Brodatz  grass  image  from 
http://sipi.usc.edu/database/. 


Figure  3.4:  Graph  of  measured  co- 

variance  of  column  projections  of  the 
grass  image  in  Figure  3.3  calculated 
without  noise  and  again  with  AWGN  of 
a  =  100  (PSNR  =  8.12). 


In  a  first  set  of  experiments,  the  algorithm  was  tested  using  a  variety  of  images 
that  were  severely  corrupted  by  adding  AWGN  to  two  frames  with  a  =  100  (PSNR 
=  8.14).  Although  this  type  of  SNR  is  not  normally  encountered  in  high- illumination 
imaging,  it  is  not  unusual  for  LADAR,  or  passive  infrared  systems  under  examination 
for  military  applications.  For  test  purposes,  this  level  of  corruption  also  produces 
errors  in  sufficient  quantities  for  meaningful  analysis. 

Using  these  corrupted  images,  the  improvement  in  registration  accuracy  at¬ 
tained  by  the  registration  algorithm  was  measured  by  estimating  shifts  for  two  frames 
when  the  actual  shift  was  zero.  The  correlation  function  for  the  images  was  mea¬ 
sured  by  circularly  shifting  windowed,  noise-free  projections  of  the  images  and  also 
by  using  a  single  noisy  frame  of  data  and  the  adjustment  in  (3.14).  In  an  actual 
imaging  system,  an  expected  correlation  model  may  also  be  assumed  based  on  a 
priori  knowledge  of  the  given  imaging  system  and  likely  observations. 
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Figure  3.5:  1024  x  1024  Bro- 

datz  sand  image  from  from 
http://sipi.usc.edu/database/. 


Figure  3.6:  Graph  of  measured  co- 

variance  of  column  projections  of  the 
sand  image  in  Figure  3.5  calculated 
without  noise  and  again  with  AWGN  of 
cr  =  100  (PSNR  =  8.13).  Although  an 
offset  is  evident  between  the  two  cases, 
this  does  not  affect  the  results  of  the 
calculations. 


35 


Figure  3.7:  1024  x  1024 

Brodatz  water  image  from 
http://sipi.usc.edu/database/. 


Figure  3.8:  Graph  of  measured  co- 

variance  of  column  projections  of  the 
image  in  Figure  3.7  calculated  without 
noise  and  again  with  AWGN  of  a  =  100 
(PSNR  =  8.13).  In  this  graph  the  peak 
has  been  adjusted  using  (3.14). 
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Image 

MSE 

unfiltered 

MSE 

F'Py  ,max 

MSE 

Wiener 

MSE 

^ 'Py  ,max 

noisy 

MSEmiri 

w 

F'Py  ,max 

w 

F Py  ,max 

noisy 

w 

MS  Emin 

Pentagon  (Figure  4.1) 

1.42 

0.384 

0.381 

0.384 

0.340 

n 

ii 

9 

Grass  (Figure  3.3) 

0.284 

0.112 

0.101 

0.127 

0.112 

3 

2 

3 

Water  (Figure  3.7) 

0.557 

0.084 

0.064 

0.102 

0.084 

5 

4 

5 

Sand  (Figure  3.5) 

0.545 

0.135 

0.119 

0.195 

0.080 

7 

10 

5 

Tank  (Figure  3.15) 

6.459 

3.90 

1.945 

3.90 

1.850 

2 

2 

12 

Table  3.1:  Measured  and  predicted  results  of  the  improved  registration  algorithm. 


In  the  following  discussion  and  graphs,  the  degree  of  noise  in  an  image  is 
measured  using  Peak  Signal-to- Noise  Ratio  (PSNR)  where  the  PSNR  of  an  S  x  T 
sized  image  is  defined  as: 


PSNR 


10  log 


(3.15) 


where  I  is  the  diffraction-limited  image,  I  is  the  corrupted  version  of  the  image,  and 
||...||i?  is  the  Frobenins  norm  which  is  defined  as  the  square  root  of  the  sum  of  the 
squares  of  the  elements  of  a  matrix.  The  test  images  studied  were  256-level  gray 
scale  so  2552  was  used  uniformly  as  the  maximum  pixel  value  in  the  numerator. 

For  each  image  the  algorithm  evaluated  kernel  sizes  from  1  to  30.  1000  pairs 
of  frames  were  produced  (i.e.  a  total  of  30,000  trials)  for  each  image  and  the  shift 
was  estimated  from  the  projections  of  the  two  frames.  As  expected,  the  results  were 
dependent  upon  the  covariance  functions  of  the  images  studied. 

For  comparison  purposes,  results  were  also  obtained  by  using  the  optimal  low- 
pass  filter  of  the  projections  as  computed  by  Wiener  filtering.  For  the  images  studied, 
the  inverse  Fourier  transforms  of  these  images  were  approximately  triangular  or  sinc- 
like.  Representative  filtering  kernels  for  column  projections  of  the  Pentagon  and 
Brodatz  water  images  are  shown  in  Figure  3.13  and  Figure  3.14.  MSEs  resulting 
from  filtering  using  these  kernels  are  found  in  Table  4.1. 

The  results  are  summarized  in  Table  4.1.  As  shown  in  this  tabic,  an  improve¬ 
ment  in  the  MSE  of  the  registration  estimate  was  demonstrated  for  all  of  the  images 
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considered.  The  results  also  showed  that  for  most  images,  the  shape  of  covariance 
function  could  be  estimated  to  useable  accuracy  using  only  the  covariance  measured 
from  the  noisy  data  and  (3.14).  Using  an  average  covariance  function  developed 
from  multiple  frames  of  data,  it  is  expected  that  results  could  approach  the  optimal 
predicted  results. 

In  the  case  of  the  Pentagon  image  of  Figure  4.1,  using  a  noise-free  estimate  of 
the  covariance  function,  a  filter  size  of  11  was  predicted  to  produce  registration  esti¬ 
mates  with  minimum  MSE.  In  fact,  this  value  improves  the  MSE  of  the  registration 
estimate  by  a  factor  of  3.7.  Using  measured  data,  the  optimal  filter  size  is  indicated 
to  be  9  and  improves  the  MSE  of  the  registration  estimate  by  a  factor  of  4.18. 

Results  for  the  textured  images  like  the  grass,  sand  and  water  images  in  Figs. 
3.3,  3.5,  and  3.7  were  the  most  accurate  of  the  images  examined  as  these  images  were 
the  closest  to  being  wide-sense  stationary  without  modification.  In  the  grass  and 
water  images,  minimum  MSEs  occurred  at  kernel  sizes  that  were  predicted  using 
noise-free  estimates  of  the  covariance  functions.  In  the  sand  image  the  noise-free 
estimate  of  the  optimal  kernel  size  was  7  and  the  actual  optimal  kernel  size  was  5. 
Using  noisy  estimates  of  the  covariance  functions  for  these  images,  it  was  possible  to 
improve  the  MSEs  of  the  registration  estimates  by  factors  of  2.5  to  5.5  over  estimates 
using  unfiltered  projections.  Estimates  developed  using  noise-free  data  improved  the 
MSEs  of  the  registration  estimates  by  factors  of  2.5  to  6.6. 

Two  main  sources  of  error  were  evident  in  the  experiments.  A  primary  source 
of  error  was  local  image  bias.  A  model  of  spatial  correlation  developed  using  circular 
shifting  of  a  windowed  image  incurs  bias  at  shifts  other  than  zero  as  new  information 
enters  the  window.  In  the  experiments,  bias  was  manifested  as  a  difference  between 
the  covariance  function  obtained  using  circular  shifting  and  the  covariance  function 
obtained  by  actually  shifting  the  windowed  portion  of  the  image  into  regions  out¬ 
side  the  image.  This  bias  helps  to  explain  differences  between  the  minimum  MSE 
predicted  using  the  covariance  function  and  the  measured  minimum  MSE.  Bias  was 
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most  evident  in  those  images,  like  the  Pentagon  image,  that  had  significant  local¬ 
ized  feature  content.  Images  without  significant  bias  tended  to  be  images  of  regular 
patterns  such  as  the  textured  images  in  Figs.  3.3,  3.5,  and  3.7. 

A  second  source  of  error  between  kernel  size  and  lowest  MSE  was  the  way 
the  estimated  shift  was  calculated.  The  algorithm  is  designed  to  minimize  errors 
occurring  at  a  of  -1  or  1  which  was  the  shift  most  likely  to  produce  an  error.  However, 
statistically,  errors  produced  by  other  shifts  that  contribute  to  MSE  can  be  expected. 
As  the  covariance  function  of  an  image  becomes  flatter,  errors  contributed  by  shifts 
other  than  -1  and  1  makes  up  a  larger  proportion  of  the  total  errors  contributing  to 
the  MSE.  A  more  complete  model  could  account  for  these  additional  shifts. 

Estimates  of  Fp  were,  in  general,  less  reliable  for  small  kernel  sizes  (i.e.  less 
than  approximately  four.)  This  error  was  attributable  to  errors  in  estimating  the 
peak  value  of  the  covariance  function  and  to  differences  between  actual  data  values 
and  the  averaged  data  values  used  in  the  covariance  functions. 

A  final  significant  observation  involved  images  like  the  one  shown  in  Fig¬ 
ure  3.15.  The  covariance  function  for  this  image  is  shown  in  Figure  3.16.  The 
covariance  function  for  this  image  is  much  flatter  than  the  others  shown  in  this  sec¬ 
tion.  Although  there  is  a  significant  amount  of  covariance  present  in  the  image,  the 
slight  slope  of  the  covariance  leads  to  a  small  value  in  the  numerator  of  (3.12).  As 
demonstrated  by  the  graph  of  Figure  3.17,  this  image  has  low  values  for  Fp  com¬ 
pared  with  other  images  in  this  section  and  filtering  does  not  appreciably  improve 
the  MSE  of  projection-based  image  registration  for  this  image  nor  does  it  change 
the  FOM  to  the  degree  evident  in  the  other  images.  This  suggests  that  algorithm 
performance  is  dependent  not  only  on  the  magnitude  of  the  covariance  function  but 
is  perhaps  also  dependent  on  the  second  derivative  of  the  covariance  function.  This 
relationship  will  be  explored  in  future  research. 

In  a  second  set  of  experiments,  the  operation  of  the  algorithm  with  a  series  of 
images  taken  by  an  actual  aerial  imaging  system  was  examined.  In  these  images, 
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a  two-lens  bias  reduction  system  was  simulated.  Before  registration,  the  images 
were  preprocessed  to  detect  and  remove  significant  local  biases  caused  by  uneven 
illumination  and  glare;  however,  high-frequency  content  was  left  intact.  For  the 
types  of  illumination  differences  in  the  images,  this  was  achieved  by  using  very  coarse 
low-pass  filtering  to  identify  local  image  intensities,  subtracting  the  result  from  the 
original  image,  and  then  adjusting  the  image  mean  to  the  mean  pixel  value  for  the 
ensemble  of  images.  To  this  result  AWGN  was  added  with  a  —  20  and  the  ability 
to  predict  kernel  size  between  similar  frames  was  examined.  Many  frames  included 
significant  homogeneous  texture  with  some  minor  features  such  as  shadows,  roads, 
drainage  ditches  and  fences. 

One  frame  was  selected  and  a  Monte  Carlo  simulation  was  performed  that 
measured  the  MSE  for  kernel  sizes  from  1  to  15  for  100  trials  each  (1500  total  trials). 
For  this  frame  the  calculated  optimal  kernel  size  was  4  which  yielded  an  MSE  of 
zero.  This  result  is  shown  graphically  in  Figure  3.19.  Using  data  that  included  the 
AWGN,  verification  was  performed  to  confirm  that  FOM  calculations  predicted  an 
optimal  kernel  size  of  four.  Other  images  in  the  series  were  examined  to  determine 
whether  FOM  calculations  performed  using  one  frame  could  be  used  in  other  frames 
in  the  series. 

A  frame  from  the  series  with  similar  but  different  image  content  is  Figure  3.20. 
Calculations  of  kernel  size  for  Figs.  3.18  and  3.20  are  shown  in  Figure  3.21.  In  a  series 
of  56  images,  that  the  mean  calculated  kernel  size  was  3.6  with  a  standard  deviation 
of  1.89  and,  as  expected,  those  images  that  had  the  most  similar  content  were  the 
most  likely  to  have  Fp  that  were  closely  matched  in  shape  if  not  in  magnitude. 

3-4  Chapter  Summary 

This  chapter  has  provided  a  method  for  improving  the  performance  of  projection- 
based  image  registration  algorithms  at  minimal  computational  cost.  It  also  explains 
how  low-pass  filtering  can  exploit  spatial  correlations  to  improve  the  performance  of 
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image  registration  algorithms.  The  preceding  sections  have  described  an  algorithm 
that  improves  the  performance  of  current  projection-based  image  registration  and 
have  described  methods  for  choosing  optimal  parameters  for  the  algorithm  based  on 
measured  data  from  the  images  being  registered.  They  have  also  described  experi¬ 
ments  conducted  with  actual  test  data  that  have  confirmed  the  analytical  results. 

Transform-domain  operations  provide  one  mechanism  for  registering  images 
that  are  not  only  translated  but  also  scaled  or  rotated;  however,  changes  in  scale  or 
rotation  may  also  be  detected  and  accounted  for  spatially.  Use  of  the  filtering  method 
described  in  this  chapter  with  dilated  or  rotated  images  is  one  possible  extension  to 
this  research. 

The  correlation  theory  contained  in  this  chapter  may  also  be  applied  to  a  host 
of  other  applications.  An  obvious  extension  to  the  work  contained  in  this  paper  is 
studying  the  effect  that  filtering  has  on  two-dimensional  correlation  problems. 
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Figure  3.9:  Calculated  FPyj  and  MSE 
for  the  Pentagon  image  in  Figure  4.1 
with  u noise  =  100,  actual  shift  =  0.  Note 
the  skewing  of  the  estimates  of  FPy: 
caused  by  estimation  errors  at  the  peak 
covariance  value.  Errors  are  less  evident 
with  additional  averaging  at  larger  ker¬ 
nel  sizes. 


Figure  3.10:  Calculated  FPy  and 

MSE  for  the  grass  image  in  Figure  3.3 
calculated  without  noise  and  again  with 
AWGN  of  <7  =  100. 


Figure  3.11:  Calculated  FP  and 

MSE  for  the  sand  image  in  Figure  3.5 
with  a noise  =  100,  actual  shift  =  0. 


Figure  3.12:  Calculated  FP  and 

MSE  for  the  water  image  in  Figure  3.7 
with  (7 noise  =  100,  actual  shift  =  0. 
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Figure  3.13:  Optimal  spatial-domain 
filtering  kernel  for  the  Pentagon  image 
in  Figure  4.1. This  is  the  Wiener  filter 
which  is  optimized  to  minimize  the  MSE 
of  the  filtered  image. 


Figure  3.14:  Optimal  spatial-domain 
filtering  kernel  for  the  Brodatz  water 
image  in  Figure  3.7.  This  is  the  Wiener 
filter  which  is  optimized  to  minimize  the 
MSE  of  the  filtered  image. 
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MSE  or  F 


Figure  3.15:  512  x  512  Tank  image  Figure  3.16:  Calculated  covariance 

from  http://sipi.usc.edu/database/.  function  for  the  512  x  512  Tank  image. 

Note  that  these  covariance  functions  are 
more  linear  than  others  shown  in  this 
chapter. 


kernel  size 

Figure  3.17:  Calculated  Fpyj  and 

MSE  for  the  tank  image  in  Figure  3.15 
with  AWGN  of  a  =  100. 
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Figure  3.18:  Aerial  image  of  cornfield 
from  a  sequential  series  of  frames. 


Figure  3.19:  Graph  of  kernel  vs  MSE 
for  the  image  of  Figure  3.18  with  AWGN 
of  a  =  20  when  measuring  an  actual 
shift  of  zero.  The  optimal  kernel  size 
for  this  image  is  shown  to  be  four. 


Figure  3.20:  An  aerial  image  of  a  road 
taken  from  the  series. 


Figure  3.21:  Comparison  of  the  calcu¬ 
lated  Fp  for  the  images  in  Figs.  3.18 
and  3.20.  Note  that  both  Fp  peak  at 
Kernel  Size  =  4. 
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IV.  Bounds  on  Image  Registration  Algorithms 


This  chapter  examines  the  limits  on  ability  to  register  images  using  techniques 
like  the  one  described  in  Chapter  III.  This  examination  is  performed  using 
theoretical  bounds  on  image  registration  algorithms  and  the  effects  of  filtering  on 
these  algorithms.  In  this  chapter,  the  examination  of  bounds  on  image  registration 
performance  is  extended  by  applying  the  Barankin  bound  to  better  account  for  the 
effects  of  projecting  and  filtering  images  than  the  CRLB.  The  chapter  also  describes 
bounds  on  registration  in  the  presence  of  optical  focal-length  errors.  Bounds  on 
the  estimates  of  the  motion  of  objects  pictured  within  a  frame  are  also  discussed 
since  these  are  an  important  part  of  target  tracking.  Where  lengthy  derivations  are 
required,  these  are  provided  in  Appendix  A  of  the  dissertation. 

4.1  Performance  Bounds  on  Image  Registration  With  Filtered  Projections 

This  section  describes  the  bounds  on  the  registration  of  a  1-D  projection  of 
an  image.  As  described  in  Chapter  III,  if  an  imaging  sensor  is  used  primarily  for 
motion  estimation,  image  projections  offer  what  is  perhaps  the  fastest  approach  for 
registering  available  images.  The  reduction  in  computational  complexity  realized 
by  using  projections  for  motion  estimation  comes  at  a  cost  in  accuracy.  While 
some  literature  cites  this  cost  as  “minimal”  [45],  this  section  provides  a  derivation 
of  equations  that  can  be  used  to  quantify  the  theoretical  limits  on  the  accuracy  of 
motion  estimates  derived  using  projection-based  methods.  This  derivation  is  similar 
to  the  one  for  the  general  CRLB  found  in  [29]. 

The  following  assumes  that  images  are  periodic  and  band-limited,  and  that  a 
general  filtering  kernel  H  can  be  defined  which  is  a  positive-definite,  circulant  matrix. 
Tq  is  circulant  and  circulant  matrices  commute,  hence 

(TQHi)(x)  =  (HTQi)(x)  (4.1) 
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An  operator  T)z  is  also  defined  which  works  as  a  convolutional  differencing  operator 
across  the  dimension  z.  T)z  is  also  circulant  and  commutes  with  T„  and  H. 

The  derivation  begins  by  describing  the  probability  distribution  function(PDF  ) 
of  any  pixel  in  the  image  as  [53] 


p(D(x,y)\l(x,y))  = 


V^7Ti 


exp 


ncr 


2  a2 


]D (x,y)  -  I {x,y)f 


The  PDF  of  a  point  in  the  y  projection  of  the  image  can  be  written  as 


(4.2) 


p(di,2/(x)|i2/(x))  = 


\/2irNcr 


exp 


2cr2N 


(di Jx)  -  Ux)f 


(4.3) 


and  the  PDF  of  the  entire  projection  can  be  written  as 


p(dl"'|i»)  =  (ot) 
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exp 


1 


2a2N 


(di,j/  d)  (dil2/  i y 


(4.4) 


If  fji2/  is  defined  to  be  the  projection  of  an  image  filtered  by  H.the  PDF  of  the  filtered 
projection  can  be  written  as  the  linear  transform 


p(fljy  =  (ot) 


N 


exp 


2a2N 


(fi, 


Hi„)TW-1(f1, 


y 


Hi, 


(4.5) 


where  cr2N W  is  the  covariance  matrix  of  the  projected  and  filtered  noise  and  W  = 
HHT  =  HTH.  Similarly,  if  there  is  a  second  image  which  is  identical  to  the  first 
except  for  a  horizontal  shift  a,  the  PDF  of  this  projection  can  be  defined  as 


P(,^2,y  |  D )  Q^) 


\/2i tNct 


N 


exp 


2a2N 
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Combining  (4.5)  and  (4.6)  to  find  the  joint  probability  of  and  f2)2/  yields 


2ixNo2 
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exp 


2a2N 


((f1,s-His)TW-1(fi,„-His 


+(f2,„  -  HT„i„)TW-1(f2,„  -  HT„i„)) 


(4.7) 
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4-1.1  The  CRLB  of  Registration  Using  Image  Projection.  The  examination 
on  bounds  begins  with  the  CRLB.  Following  the  approach  used  in  [48],  (4.7)  can  be 
used  to  create  a  block  FIM  J  of  the  form: 


J  = 


(4.8) 


In  Appendix  A. 2.1  it  is  shown  that  under  the  assumption  of  circulant  filtering  ma¬ 
trices,  the  Fisher  information  matrix(FIM)  for  the  filtered  projection  is  identical  to 
the  FIM  for  the  original  data.  Consequently,  the  FIM  can  be  defined  in  terms  of  the 
unfiltered  projections  as 
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(4.9) 

(4.10) 

(4.11) 


As  shown  in  Appendix  A. 2.1,  the  FIM  can  be  inverted  using  block  matrix  techniques 
to  arrive  at  the  result 


VAR  (a)  >  2a2N[\\Dxiy\\2]  \  (4.12) 

This  bound  derivation  using  the  nuisance  parameters  differs  from  the  one  derived 
in  [46]  by  a  factor  of  two.  This  result  is  apparent  for  unfiltered  images  from  work 
in  [48]  and  is  suggested  by  work  in  [55],  but  is  not  explicitly  stated  in  either  document. 

4-1.2  The  Barankin  Bound  on  Registration  Using  Projections.  In  the  pre¬ 
vious  subsection,  the  filtering  terms  dropped  out  of  the  derived  bounds.  This  is  also 
the  case  for  the  Barankin  bound  so  the  bounds  are  calculated  using  the  unfiltered 
PDF  of  the  image.  For  this  case,  the  FIM  calculated  using  (4.8)  is  used  as  J,  the 
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FIM  calculated  as  in  the  CRLB,  and 


L(di,  d2  @j,  0O)  =  exp 

2^-N^2  1>')T(d 

=  exp 
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T„1i„)T(d2  -  'I'  i,,)]  , 
(4.13) 


where  ©G  =  iy  is  the  true  value  of  a  projected  image  and  ©;  =  is  a  projected 

image  shifted  by  Ta..  From  (4.13).  it  follows  that 
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The  integer  shifts  of  aq  =  —1  and  CK2  =  1  are  used  as  the  most  likely  registration 
errors  and  the  Barankin  bound  is  calculated  using  the  equations  derived  in  [37] 
and  [38].  The  final  terms  required  to  calculate  the  bound  for  the  given  conditions 
are 


B  ij  =  exp 


= 


a2N  L 
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This  yields  the  following  form  of  the  Barankin  bound, 


E[(©  —  ©)2]  =  + 


(4.15) 

(4.16) 


(4.17) 


Because  the  bounds  are  clearly  image  dependent,  numeric  methods  are  em¬ 
ployed.  Results  calculated  using  this  bound  are  found  in  Section  4.3. 
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4-2  Bounds  of  Two  Dimensional  Image  Registration  With  Filtered  Images 

Image  registration  using  all  of  the  2-D  spatial  information  available  in  the  image 
is  much  more  robust  than  registration  using  projections  alone.  This  section  derives 
these  bounds  and  also  examines  the  case  where  a  pre-detection  image  has  been 
corrupted  by  a  focal-length  error  which  is  modeled  as  an  optical  filter.  Consequently, 
in  the  two-dimensional  case,  not  only  is  the  derivation  slightly  more  mathematically 
complicated,  but  there  are  also  different  filtering  scenarios  that  can  be  accounted  for 
and  different  bounds  that  can  be  examined.  In  the  two-dimensional  case,  these  two 
significant  filtering  scenarios  are: 

1.  Optical  filtering  -  filtering  is  performed  in  the  optics  before  sensor  noise  is 
added  and 

2.  Post-detection  filtering  -  filtering  is  performed  on  an  image  after  sensor  read¬ 
out. 

These  two  cases  are  derived  jointly  and  the  notation  Hp  is  used  to  denote  post¬ 
detection  filtering  and  H0  is  used  to  denote  optical  filtering.  Since  this  derivation 
is  similar  to  the  one-dimensional  case,  it  is  included  as  an  appendix;  however,  the 
results  are  summarized  in  the  following  subsections. 

4-2.1  2-D  CRLB  with  Optical  Filtering.  The  registration  of  two  frames 

of  data,  Dx  and  D2  as  defined  in  (1.8)  and  (1.9)  is  examined  first.  The  noise  in 
these  frames  (Qi  and  Q2)  is  assumed  to  be  spatially  and  temporally  uncorrelated 
and  Gaussian.  This  noise  model  typifies  fixed  pattern  noise  and  read  noise  in  the 
readout  amplifier  of  a  CCD  that  are  typically  the  dominant  noise  sources  in  very- 
low- intensity  images  [20],  [25].  The  natural  logarithm  of  the  joint  probability  of  these 
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two  frames  represented  as  vectors  (1.8)  and  (1.9)  can  be  written  as 


MD^Dall) 


1 

2lt<T2 


(Di- 


+ constant, 


HpH0lfW  '(Dx-HpHoI) 
HpH0Ta)/3I)TW"1(D2  -  HpH0Ta  /3I) 

(4.18) 


where  now  W  =  HpHpT  =  Hp7  Hp.  Using  the  derivation  found  in  Appendix  Section 
A. 2. 2,  the  CRLB  for  the  2-D  case  is  found  to  be 


VAR(a)  > 
VAR(/3)  > 


_ 2a2||D,H0I||2 _ 

||^:H0I||2||DyH0I||2  -  (2)XH0I,  DyH0I)2’ 

_ 2a2||D,rH0I||2 _ 

||D,:H0I||2||2)yH0I||2  -  (DXH0I.  DyH0I)2’ 


(4.19) 

(4.20) 


where  ||.||2  is  the  square  of  the  L2  norm  of  a  vector,  (  ,  )  is  the  inner  product  of  two 
vectors,  and  I  is  a  vectorized  version  of  a  2-D  image.  In  this  case,  as  the  size  of  the 
filtering  kernel  increases,  the  magnitude  of  the  the  terms  ||l>rH0I||2  and  ||DyH0I||2 
decreases.  Intuitively,  as  resolution  is  lost  in  the  image,  it  becomes  more  difficult  to 
register.  This,  however,  is  not  the  case  for  filtering  performed  after  detection  since  Hp 
has  dropped  out  of  (4.19)  and  (4.20).  It  is  interesting  to  note  that  although  filtering 
images  has  been  shown  to  improve  the  performance  of  correlation  and  gradient-based 
image  registration  [4],  [39],  [46],  [47],  the  type  of  post-detection  filtering  performed 
in  these  papers  does  not  improve  the  CRLB.  Rather,  post-detection  filtering  is  a 
part  of  the  estimation  process  that  may  yield  performance  approaching  the  bound. 


4-2.2  2-D  Barankin  Bound.  Looking  at  the  two-dimensional  case  for  optical 
filtering,  J  is  again  the  (N  +  2)  x  (N  +  2)  FIM  as  derived  for  the  CRLB.  If  the  most 
likely  errors  are  expanded  to  include  shift  vectors  of 


KA]e{[io],[-io],[o  l],  [o  -l]}, 
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then  the  matrix  $  can  be  constructed  as 


$  = 


-1 

0 


1 

0 


0 

-1 


y  TQ,j(gH0I|Q,=_2  yj=Q  Ta )igH0I|Q,=^ijg=Q  Ta  £jHoI|Q=0 


0 
1 

Tq^HqI  |  q/=o,/3=i  J 

(4.21) 


In  this  case  the  likelihood  function  is  calculated  as 


L(Di,  D2|@j,  @0)  — 


p(Di,D2|H0Tai<9I,Qj,^) 
p(Di,  D2|H0Ta,)(gI.  a  =  0,/?  =  0) 


=  exp 


^D^(H0TaiiAI-H0I) 


(4.22) 


Beginning  with  (A. 20),  the  partial  derivatives  of  the  log- likelihood  functions  are 


<91np(Di,D2|I,a,/3) 

da 

<91np(Di,D2|I,a,/3) 

op 

d  lnp(Di,  D2|I,  a,  (3) 

dl 


(D2  -  h0tQ)A[)t 


0‘ 

1 


<7 


2  \D2  —  H0TaiA[ 


gHoT^I 

da 


T<9H0Tai/3I 


dp 


-  [Hr(D!  -  H0I)  +  (Ta  /jH0)t(D2  -  H0TQi/3I 

tjZ 


It  is  easy  to  see  that,  as  with  the  one  dimensional  case,  AtJ  =  0  V  i,  j.  Finally, 


B  ij  =  exp 


;hgi  -  H0TQi,AI)T  (H0I  -  h0tq.  Ai) 


(4.23) 


with  which  the  Barankin  bound  can  be  calculated  numerically  as  a  function  of  the 
CRLB  from  (A. 34), 


E[(0  —  ©)2]  >  J1  +  <f>B~1<f>T. 


(4.24) 


Results  for  optically-hltered  images  calculated  using  this  bound  are  found  in  Section 
4.3. 
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4-3  Experimental  Results 

To  examine  the  effect  of  the  calculated  bounds  on  typical  images,  bounds  were 
calculated  and  examined  for  a  standard  image  and  for  an  ensemble  of  frames  of 
LIDAR  data.  Bounds  were  calculated  numerically  for  different  image  sizes,  different 
image  intensities  and  for  varying  registration  methods.  Bounds  were  specifically 
examined  for  the  registration  of  images  when  using  projections  of  the  images  and  for 
the  more  robust  case  of  2-D  image  registration.  The  effects  of  focal-length  errors  on 
the  CRLB  and  the  Barankin  bound  for  the  images  were  also  examined.  The  results 
obtained  show  that  small  image  size,  low  illumination  intensity,  and  focal-length 
errors  increase  the  relevance  of  the  Barankin  bound  to  registration  estimates. 

In  the  following  discussion  and  graphs,  8-bit  gray-scale  images  are  employed 
and  the  degree  of  noise  in  an  image  is  measured  using  PSNR  where  the  PSNR  of  an 
S  x  T  sized  image  is  defined  as: 


PSNR 


10  log 


(4.25) 


where  I  is  the  database  image,  I  is  the  corrupted  version  of  the  image,  and  ||...||f  is 
the  Frobenius  norm  which  is  defined  as  the  square  root  of  the  sum  of  the  squares  of 
the  elements  of  a  matrix.  For  experimental  LIDAR  images,  where  images  intensities 
are  measured  by  photon  counts,  the  maximum  value  of  the  average  frame  is  used  in 
place  of  255  in  the  numerator  of  (4.25). 


4-3.1  Registration  Performance  for  Standard  Pentagon  Image.  To  simulate 
low-SNR  conditions  that  might  be  present  in  night-time  or  passive  infrared  (PIR) 
filtering,  the  intensity  of  the  image  shown  in  Figure  4.1  was  divided  by  4  and  then 
corrupted  with  AWGN.  For  the  original  images,  this  produced  pixel  values  in  the 
range  [15,60]  where  the  maximum  possible  pixel  value  was  255. 
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Figure  4.1:  1024  x  1024  Pentagon  im¬ 

age  from  http://sipi.usc.edu/database/. 


Figure  4.2:  Bound  on  the  variance 

of  estimates  of  the  x-shift  for  the  im¬ 
age  shown  in  Figure  4.1  using  both  pro¬ 
jections  and  2-D  registration.  Bounds 
for  2-D  registration  are  lower  than  those 
derived  using  projections.  To  simulate 
low-light  conditions,  the  pixel  values  of 
the  source  image  are  divided  by  a  factor 
of  4. 


f. 3. 1.1  Performance  of  Projection  and  Two-Dimensional  Registration. 

The  Barankin  bound  and  the  CRLB  were  first  calculated  for  the  full  image  and 
for  a  128  x  128  subsection  of  the  image  shown  in  Figure  4.3,  without  any  defocus 
errors  (i.e.  H0  =  J).  For  both  the  1024  x  1024  image,  and  the  128  x  128  subsection, 
the  CRLB  and  the  Barankin  bound  of  2-D  horizonal  shift  estimates  were  the  same, 
for  practical  purposes.  With  less  information,  it  was  expected  that  the  Barankin 
bound  would  be  more  pronounced  for  the  smaller  image  of  Figure  4.3  and  for  the 
projection  bounds  in  general.  This  was,  in  fact,  the  case  as  shown  by  examining  and 
comparing  the  results  of  the  analytical  bound  calculation  shown  in  Figures  4.2  and 
4.4.  In  both  cases,  the  bound  on  estimates  using  projections  is  higher  than  those  on 
2-D  estimates;  however,  the  breakpoint  of  the  Barankin  bound  occurs  at  the  highest 
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Figure  4.3:  128  x  128  subsection  of 

Figure  4.1. 


Figure  4.4:  Calculated  performance 

bounds  for  registration  using  projec¬ 
tions  and  2-D  registration  for  the  128 
x  128  subsection  of  the  Pentagon  im¬ 
age  in  Figure  4.3  where  the  intensities 
of  the  source  images  have  been  divided 
by  a  factor  of  4. 


SNR  in  the  case  of  estimates  for  Figure  4.3.  In  general,  the  bounds  on  registration 
using  two-dimensional  correlation  were  much  lower  than  the  bounds  on  registration 
using  1-D  projections  and  the  deviation  of  the  Barankin  bound  from  the  CRLB  is 
much  less  pronounced  in  2-D  filtering. 


4-  3. 1.2  Bounds  on  Registration  in  the  Presence  of  Defocus  Errors. 
Bounds  on  registration  estimates  were  then  calculated  using  the  images  of  Figures  4.2 
and  4.4.  This  required  making  some  basic  assumptions  about  the  optical  system  un¬ 
der  study.  The  calculated  bounds  for  the  optical  filtering  case  were  performed  using 
an  optical  model  based  on  specifications  from  a  Celestron  14”  (356  mm)  Schmidt- 
Cassegrain  telescope  operated  at  a  range-to-objective  of  20  km.  This  telescope  has 
a  focal  length  of  approximately  four  meters.  Calculations  were  performed  for  light 
with  a  wavelength  of  0.5  jam.  Measurements  were  then  simulated  of  a  diffraction- 
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Figure  4.5:  Image  shown  in  Figure  4.1  Figure  4.6:  Bounds  for  the  image 

with  simulated  0.7A  defocus  error.  shown  in  Figure  4.1  with  simulated  0, 

0.1A,  0.3A,  and  0.7A  defocus  errors. 
Note  that  the  difference  between  the 
bounds  increases  with  increasing  defo¬ 
cus. 

limited  image  with  focal-length  errors  of  0  to  0.7  wavelengths  in  increments  of  0.1 
wavelengths.  Representative  defocused  images  are  shown  in  Figures  4.5  and  4.7. 

As  shown  in  Figures  4.6  and  4.8,  the  bounds  on  shift  estimates  for  both  images 
increased  with  increasing  defocus  error.  Similarly,  the  Barankin  bound  became  in¬ 
creasingly  relevant  with  increased  defocus  errors  -  especially  in  the  case  of  the  smaller 
images.  As  before,  the  smaller  image  had  higher  overall  bounds  due  to  decreased 
information  content. 


4-3.2  Registration  Performance  of  Actual  LIDAR  data.  Using  insights 
gained  from  the  examination  of  standard  test  images,  a  series  of  50  frames  of  LI¬ 
DAR  data  captured  using  techniques  and  equipment  described  in  [36]  was  examined. 
These  individual  frames  were  median  filtered  to  remove  specular  returns  and  spa¬ 
tially  registered  using  a  two-dimensional  cross-correlation.  Then  after  filtering  and 
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Figure  4.7:  Image  shown  in  Figure  4.3  Figure  4.8:  Bounds  for  the  image 

with  simulated  0.7A  defocus  error.  shown  in  Figure  4.3  with  simulated 

0,  0.1A,  0.3A,  and  0.7A  defocus  er¬ 
rors.  Note  that  smaller  image  results  in 
bounds  higher  than  those  displayed  Fig¬ 
ure  4.6  and  that  the  difference  between 
the  bounds  increases  with  increasing  de¬ 
focus. 

registration,  the  frames  were  averaged  to  create  a  representative  256  x  256  diffraction- 
limited  image  which  was  considered  to  be  “truth”  data.  This  resulting  image  is  shown 
in  Figure  4.9  and  a  representative  frame  of  data  is  shown  in  Fig.  4.10.  Of  particular 
interest  was  the  region  of  interest  shown  in  Figure  4.11. 

Using  the  frame  average  in  conjunction  with  registration  estimates  of  the  in¬ 
dividual  frames,  PSNRs  were  calculated  for  each  of  the  50  frames  of  data.  The 
calculated  PSNRs  for  the  frames  ranged  from  25.7  dB  to  27.7  dB  with  a  mean  dB 
value  of  27.07  dB. 

The  frame  average  was  also  used  to  calculate  theoretical  bounds  on  the  MSE 
of  registration  estimates  for  the  frames  in  the  ensemble.  Bounds  on  estimates  of 
column  shifts  of  the  frames  using  both  projections  and  2-D  estimates  for  the  full 
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Projection-Based  Registration 

2-D  Registration 

Est  PSNR 

Cramer-Rao 

Barankin 

Cramer-Rao 

Barankin 

Full  Image 

26.3  dB 

0.153 

0.153 

0.0089 

0.0089 

Region  of  Support 

24.6  dB 

0.790 

0.953 

0.0471 

0.0471 

Table  4.1:  Calculated  registration  bounds  for  full  frame  and  region  of  interest  of 
LIDAR  data  in  units  of  pixels2. 


image  are  shown  graphically  in  Figure  4.13.  The  threshold  for  the  Barankin  bound 
(the  PSNR  below  which  the  bound  diverges  from  the  CRLB)  is  shown  to  occur 
at  an  approximate  PSNR  of  23.5.  The  lowest  PSNR  was  25.7  dB  which  lies  in 
a  region  where  the  CRLB  and  the  Barankin  bound  are  coincident.  Thus,  for  the 
full  frames,  the  CRLB  is  an  adequate  measure  of  the  bound  on  projection-based 
registration.  Bounds  were  also  calculated  for  registration  estimates  performed  using 
2-D  shift  estimation  algorithms.  As  expected,  bounds  on  variances  on  these  estimates 
were  significantly  lower  than  those  calculated  using  projection-based  methods.  2-D 
bounds  are  also  shown  graphically  in  Figure  4.13.  A  comparison  of  the  CRLB  and 
Barankin  bounds  for  1-D  and  2-D  image  registration  given  the  calculated  PSNRs  of 
the  LIDAR  data  is  shown  in  Table  4.1. 

For  automatic  target  recognition  problems,  it  is  often  necessary  to  identify 
and  estimate  motion  in  a  specific  target  among  background  clutter.  For  these  ap¬ 
plications,  the  ability  to  estimate  motion  of  an  object  is  also  theoretically  bounded. 
Examination  of  these  bounds  began  by  selecting  a  subregion  of  interest  within  the 
image  shown  in  Figure  4.9.  The  target  for  this  experiment  was  a  68  x  168  region  of 
interest  shown  in  Figure  4.11.  An  estimate  of  the  motion  of  the  tank  between  consec¬ 
utive  frames  depends  on  the  ability  to  register  the  regions  of  interest  in  consecutive 
frames.  As  shown  in  Figure  4.14,  the  bounds  on  registration  with  projections  are 
significantly  higher  than  with  2-D  registration  techniques.  It  is  also  interesting  to 
note  that  the  Barankin  bound  for  registration  using  projections  is  approximately 
30%  higher  than  the  CRLB  within  the  range  of  PSNRs  encountered  in  the  data. 


Figure  4.9:  256  x  256  image  resulting 

from  median  filtering,  and  averaging  50 
frames  of  LIDAR  data  captured  at  10 
km  from  the  target.  “Truth”  Data  (ap¬ 
prox.) 


Figure  4.11:  68  x  168  region  of  inter¬ 

est  within  the  image  of  Figure  4.9. 


Figure  4.10:  Representative  LIDAR 

frame  prior  to  filtering  and  averaging 
(PSNR  =  26.3). 


Figure  4.12:  Representative  region  of 
interest  in  a  LIDAR  frame  prior  to  fil¬ 
tering  and  averaging  (PSNR  =  24.6). 


4-4  Chapter  Summary 

This  chapter  provided  a  calculation  and  comparison  of  theoretical  performance 
bounds  for  image  registration  algorithms.  It  showed  that  for  large  images  under 
conditions  of  full-frame  registration,  the  CRLB  is  an  adequate  measure  of  perfor¬ 
mance  for  most  realistic  imaging  conditions.  For  projected,  small  images,  or  image 
corrupted  by  focal-length  errors,  however,  the  CRLB  may  not  sufficiently  predict 
bounds  on  the  performance  of  a  registration  algorithm  and  the  Barankin  bound 
provides  a  more  accurate  estimate. 
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Figure  4.13:  Bounds  on  registration 

using  projections  and  2-D  registration 
of  the  LIDAR  frame  shown  in  Fig¬ 
ure  4.9.  The  CRLB  and  Barankin 
bound  for  the  2-D  registration  case  are 
indistinguishable. 


Figure  4.14:  Bounds  on  registration 

using  projections  of  the  LIDAR  frame 
region  of  interest  shown  in  Figure  4.11. 
For  projection-based  registration,  the 
breakpoint  of  the  Barankin  bound  is  ap¬ 
proximately  29.0  which  is  well  above  the 
PSNRs  of  the  region  of  interest.  In  the 
2-D  registration,  the  breakpoint  of  the 
Barankin  bound  falls  far  below  the  aver¬ 
age  PSNRs  for  the  data.  This  indicates 
that  the  CRLB  is  an  adequate  bound 
for  2-D  but  not  1-D  registration  of  this 
data  set. 
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This  chapter  also  showed  that  understanding  of  registration  bounds  is  extended 
by  calculating  the  Barankin  bound.  Calculations  for  the  Barankin  bound  were  largely 
numeric  and  were  based  on  the  most  probable  registration  errors.  For  the  images 
examined,  image  registration  using  projections  increased  both  the  CRLB  and  the 
Barakin  bound  as  compared  to  bounds  derived  using  2-D  registration  algorithms. 
However,  it  is  also  worth  noting  that  Cain  et  al.  [11]  show  that  shift  estimates  for 
low  intensity  images  in  the  presence  of  fixed  pattern  noise  may  actually  be  better 
using  projections  than  full  2-D  estimates.  Bounds  on  this  behavior  would  also  make 
an  interesting  future  study. 

It  was  also  shown  that  for  the  test  images,  the  CRLB  and  Barankin  bounds 
increased  as  the  severity  of  the  defocus  errors  increased.  With  this  increasing  defocus, 
it  was  demonstrated  that  the  Barankin  bound  became  more  pronounced  and  more 
applicable  to  images  with  higher  SNRs. 

Perhaps  the  most  interesting  aspect  of  the  research  documented  in  this  chapter 
was  that  the  bounds  under  study  were  most  applicable  to  distortions  of  small  images. 
In  many  target-recognition  applications,  objects  being  imaged  may  be  rotated  and 
dilated  and  salient  features  may  be  extracted  using  various  filtering  techniques.  This 
observation  suggests  many  follow-on  applications.  For  instance,  the  Barankin  bound 
may  be  of  increasing  importance  to  applications  where  images  are  aliased  or  in  the 
differentiation  of  multiple  similar  targets.  Another  interesting  extension  would  be 
the  calculation  of  bounds  on  identification  of  objects  and  features  with  contrasting 
colors  under  low  light  conditions.  Other  extensions  to  this  research  will  be  discussed 
in  Chapter  VI  of  this  dissertation. 
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V.  Block-based  Methods  for  Denoising  Images 


Block-based  denoising  algorithms  approach  single-frame  denoising  using  tech¬ 
niques  that  are  similar  to  the  multiframe  averaging  techniques  facilitated  by 
image  registration.  This  chapter  introduces  several  new  block-based  denoising  algo¬ 
rithms  that  produce  impressive  results,  especially  in  low-SNR  scenarios. 

The  methods  that  are  presented  in  this  chapter  begin  by  thresholding  the 
variance  of  individual  blocks  to  identify  areas  that  are  effectively  handled  by  standard 
image  processing  techniques.  Like  the  method  proposed  by  Kervrann  and  Boulanger 
[31],  the  methods  described  here  use  the  Euclidian  distance  between  blocks  and 
develop  a  threshold  based  on  the  chi-square  distribution  to  identify  matching  blocks. 
Unlike  the  method  proposed  by  Kervarann  and  Boulanger,  the  methods  used  in  this 
chapter  use  fast  approximations  for  determining  these  thresholds,  examine  block 
correlations  and  higher  order  statistics  of  the  error  function  to  match  blocks  with 
similar  content,  and  rely  on  a  simple  binary  weighting  scheme  to  combine  blocks 
in  a  way  that  produces  a  denoised  estimate  of  a  region  of  interest.  These  new 
methods  also  improve  on  low-SNR  performance  of  other  methods  by  suspending  the 
requirement  for  comparative  blocks  to  be  spatially  close.  This  allows  combining  data 
from  across  an  entire  image  and,  in  fact,  could  facilitate  combining  data  from  entirely 
different  image  frames  from  the  same  sensor. 

5.1  The  Gaussian  Detection  Denoising  Method 

Using  the  mathematical  background  described  in  Appendix  B,  the  NLM  al¬ 
gorithm  was  modified  to  improve  its  performance.  This  section  describes  how  the 
new  algorithm  was  formulated  and  implemented  to  improves  the  performance  of  the 
NLM  algorithm.  The  method  exploits  redundancy  in  the  image  and  improves  on 
both  the  theoretical  foundation  and  the  output  of  the  NLM  algorithm.  This  new 
algorithm  which  will  heretofore  be  known  as  the  “Gaussian  Detection  Denoising” 
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or  GDD  method.  The  algorithm  is  briefly  described  below  and  will  be  explained  in 
detail  later  in  this  section. 

5.1.1  Overview  of  the  GDD  Denoising  Method. 

1.  Begin  with  an  image  of  size  S  x  T. 

2.  Select  an  N  x  IV-sized  subimage  centered  at  (i,j)  within  the  image  to  denoise. 
Call  this  block  F ij. 

3.  For  every  pixel  centered  at  (s,  t)  in  the  parent  image,  define  a  neighborhood 
around  the  pixel  and  call  this  neighborhood  the  block  GS)t. 

4.  Subtract  the  means  from  FtJ  and  all  Gsj  blocks. 

5.  For  all  GS)t  calculate  the  scalar  A (s,t)  that  minimizes  the  mean-squared  er¬ 
ror  between  F t  j  and  A(s,t)Gsj. 

6.  For  all  GS:t  calculate  the  mean  square  of  Fjj  —  A(s,t)GS;t.  Call  this  value 
M SE(s,  t). 

7.  Examine  the  distribution  of  MSE(s,t)  for  all  s  and  t.  Determine  if  it  has 
a  Gaussian  distribution  or  if  it  is  possible  to  detect  a  subset  of  MSEs  that 
naturally  form  a  Gaussian  distribution. 

8.  If  no  Gaussian  exists,  use  the  original  value  of  F,j  (not  just  the  center  pixel) 
as  the  denoised  value. 

9.  If  a  Gaussian  does  exist,  average  the  values  of  Fjj  and  the  GSj*  blocks  that 
form  the  Gaussian.  Call  this  averaged  block  the  new  value  for  F itj. 

10.  Restore  the  mean  to  the  block  F^.  This  is  the  denoised  value  of  F,;j. 

11.  Repeat  steps  2-10  for  other  blocks  in  the  image  as  desired. 

12.  Recombine  the  individual  blocks  Fjj. 
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5.1.2  GDD  Preliminary  Assumptions  and  Calculations.  To  consider  the 
algorithm  in  more  detail,  suppose  that  there  exists  an  image  D,  a  region  of  interest 
defined  F;J-,  and  subimages  used  for  comparison  defined  as  GS)t  as  in  Section  2.3. 
Furthermore,  call  the  index  of  each  pixel  in  these  subimages  (u,  v )  where  u  E  U  = 
{1, ...,  IN'}  and  v  E  V  =  (1, ...,  l\f}.  To  remove  illumination  and  reflectance  differences, 
the  means  are  removed  from  both  the  Fjj  and  GSjt ■  That  is,  for  all  values  of  F  and 
G  indexed  by  x  and  y: 

F ij  (x,y)  =  F u (x.y)  -  ^  ^  FUj  (u,v),  (5.1) 

uGli,vGV 

and 


Gs,t(x,y)  =  GStt(x,y)  -  ^  ^2  G Sit(u,v)  V  GS)t  E  D.  (5.2) 

uGll,vGV 

Then  define 

A.(s,t)G s,t 

GDD(F.j)  =  - •  (5.3) 

£  A (s,t) 

seS.teT 

It  is  desirable  to  use  weights  that  are  not  necessarily  dependent  on  a  priori  knowledge 
of  the  noise.  Instead  of  attempting  to  calculate  a  unique  weight  for  each  Gs,t,  two 
hypotheses  are  posed: 


H1  : 
H° : 


F  itj  Gsrf 
F  ij  Gs,t 


2 

F 

2 

F 


|Qi,j  Qs, 


ill F  i 


9^  II Qu  Qs, 


i  II F 


(5.4) 

(5.5) 


In  (5.4)  and  (5.5),  Q ^  and  QS;t  are  realizations  of  the  noise  in  G S)t  and  Fjj.  In  other 
words,  on  average,  H 1  corresponds  to  the  case  when  Gs,t  and  F t  J  are  approximately 
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identical  but  differ  by  AWGN  and  H°  corresponds  to  the  case  when  the  difference 
between  Gs,t  and  Fjj  is  greater  than  that  attributable  to  noise. 

The  problem  with  using  these  hypotheses  is  that  the  only  available  data  is 

on  the  left  hand  sides  of  (5.4)  and  (5.5)  and  consequently,  it  is  not  possible  to 

develop  a  traditional  likelihood  ratio  test.  However,  using  the  background  provided 

in  Appendix  B,  it  is  possible  to  develop  an  alternate  test  which  allows  a  rough 

differentiation  between  these  two  cases.  As  discussed  in  Appendix  B,  if  there  exist 

2  2 

a  G Sjt  such  that  F?;j  —  GSjt  ~  ||Qj,i  —  Qs,t||F,  then  the  distribution  of  the  MSEs 

F 

„  „  2 

of  F ij  —  GSjt  will  be  a  noncentral  chi-square  distribution.  Without  a  priori 

F 

knowledge  of  the  distribution  of  the  noise  it  is  not  possible  to  estimate  A;  however,  v 
corresponds  to  the  number  of  pixels  in  F ^  and  GSi*.  As  u  increases  towards  inhnity, 
the  shape  of  the  distribution  x'vW  becomes  Gaussian  as  demonstrated  graphically 
in  Figure  5.1.  Therefore,  if  v  is  chosen  to  be  sufficiently  large,  it  is  expected  that 
when  Fjj  and  GStt  are  approximately  equal  except  for  additive  noise,  the  distribution 
of  MSEs  between  F?  J-  and  GS)t  will  be  Gaussian.  Therefore,  the  two  hypotheses  are 
reposed  as  follows: 


//'  :  Fi  j  -  Gs  t  2 

F 


H°  :  else. 


2 

forms  a  Gaussian  distribution 

F 


(5.7) 


As  described  in  more  detail  in  Section  5.1.3,  the  GDD  algorithm  will  determines  this 
hypothesis  by  iteratively  setting  a  threshold  within  the  MSEs  of  each  block  F,  j  and 
testing  to  see  whether  the  distribution  of  MSEs  below  this  threshold  is  Gaussian. 
When  G Stt  =  F^,  it  also  assumes  H] .  Using  the  two  hypotheses,  it  is  then  possible 
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to  assign  a  binary  weight  to  A (s,t)  such  that 

H 1  =>•  A (s,t)  =  1, 

H°  =>  A (s,t)  =  0. 

5.1.3  Observed  Distribution  of  Mean  Squared  Errors.  For  a  given  image 
D,  examining  the  histogram  of  the  MSEs  between  a  single  F,j  and  all  other  Gs>t, 
yields  something  that  may  be  similar  in  appearance  to  Figure  5.2.  As  noted,  when 
F t  j  and  Gsj  are  sufficiently  large,  if  there  is  sufficient  redundancy  in  the  image,  the 
distribution  of  the  MSEs  will  be  a  combination  of  two  distributions,  one  of  which 
is  a  x^(A)  distribution  that  approximates  a  Gaussian.  This  is  demonstrated  in 
Figure  5.1.  If  he  variance  of  the  noise  is  known,  the  A  and  the  location  of  E[xl( A)] 
can  be  estimated.  In  the  absence  of  this  knowledge,  it  may  be  possible  to  locate 
a  Gaussian  distribution  in  the  distribution  of  the  Mean  Squared  Errors  by  scaling 
the  MSEs  and  using  an  iterative  process  to  detect  a  Gaussian  distribution.  The 
subimages,  GS)t,  with  MSEs  that  lie  in  this  Gaussian  distribution  are  considered  as 
satisfying  H1. 

5. 1.3.1  Detection  of  H1  and  H° .  To  perform  this  separation,  first 
find  a  constant  A (s,t)  for  each  GS)*  that  minimizes  the  MSE  between  F,:j  and 
A(s,t)GS:t-  This  can  be  accomplished  by  minimizing  the  quantity 

MSE  —  1 1 F —  A(s,  f)GS;t|^  . 

When  this  equation  is  expanded  out,  it  yields  the  following  where  the  notation  (F,  G) 
indicates  an  inner  product  and  K{}  indicates  taking  the  real  part  of  a  complex 
number 

MSE=  jL  ^||FiJ||^-2A(s,t)3«{(Fi,i,GSit)}  +  A(s,  f)2  ||gm|Q  . 
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Figure  5.1:  Graph  of  the  PDF  X225 C^)  over  0  <  x  <  500  which 
occurs  for  N  —  15. 

For  intensity  images  (i.e.  I  =  9?{I}),  setting  the  first  derivative  of  this  equation 
equal  to  zero  and  solving  for  A (s,t)  produces 


A  (s,t) 


F  id,G 


(5.8) 


If  the  histogram  resulting  from  the  calculation  of  MSE  is  examined  for  all  values 
of  A (s,t)  and  GSit,  a  distribution  is  arrived  at  that  appears  similar  to  Figure  5.3 


which  has  an  upper  bound  of 


F 

■*- 1,3 


/N2.  Subimages  Gs,t  that  are  substantially 


different  from  will  tend  to  be  minimized  by  A (s,t)  and  will  have  MSEs  that  fall 
close  to  the  upper  bound  of  the  histogram.  Subimages  that  satisfy  Hl .  if  they  exist, 
will  be  found  in  the  tail  of  the  distribution.  If  the  variance  of  the  noise  is  known, 
one  can  predict  a  range  for  this  Gaussian,  but  in  the  absence  of  this  knowledge  it 
is  necessary  to  rely  on  an  iterative  process  to  eliminate  GSjt  that  do  not  form  a 
Gaussian  distribution. 
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Figure  5.2:  Histogram  of  the  Mean  Squared  Error  between 

a  representative  FtJ  and  all  Gs,o  No  scaling  factor  has  been 
applied  to  GS:t- 


Beginning  at  the  upper  bound  of  the  histogram  of  the  MSEs,  test  the  dis¬ 
tribution  of  the  MSEs  that  are  less  than  7  using  the  Lillicfors  test  for  normality, 
described  in  [18].  If  the  distribution  is  Gaussian,  assign  A(.s,f)  =  1  for  all  G Sit. 


If  not,  set  (A(s,t)  =  0  |  7  <  Fjj  —  A(s,  t)Gs>t 


2 

A 

< 

F 

F-  • 

1,3 

}  and  {A(s,  t)  = 


Fid  -  A(s,t)Gs>t  <  7}  and  reapply  the  Lilliefors  test.  Then,  select  7  based 

F 

on  the  amount  of  processing  time  desired.  This  step  is  iterated  for  decreasing  values 
of  7  until  either  satisfied  the  Lilliefors  test  is  satisfied  for  some  number  of  GS)t  or  all 
Gm  have  been  eliminated  as  potential  matches  for  Fjj.  In  this  way,  for  each  F,  7-,  a 
value  for  GDD(Fjj)  =  /  A(s,f)GsG  is  determined. 


5. 1.3.2  Image  Restoration.  To  complete  the  algorithm  for  a  single 
subimage,  the  block  mean  is  restored  by  calculating 

^i,j  =  GDD(Fjj)  +  Fj j(u,v). 

u£U,v£V 
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It  is  then  necessary  to  restore  the  image  from  over  all  (i,j).  One  simple  way  to 
approach  this  is  to  use  the  restoration  approach  of  the  NLM  and  say 

I  (i,j)  =  F  ij  (n  +l,n  +  l). 

The  problem  with  this  approach  in  the  GDD  is  that  the  original  calculation  of  F,j 
involved  subtracting  the  block  mean.  As  discussed  in  Appendix  B,  the  sample  mean 
for  a  block  of  zero  mean  noise  is  subject  to  some  variation.  Consequently,  some 
portion  of  Fjj  is  still  attributable  to  noise.  Experimentally  it  was  determined  that 
it  is  advantageous  to  restore  the  image  by  summing  and  then  averaging  overlapping 
F ij  across  the  entire  image. 

It  is  also  possible,  and  computationally  advantageous,  to  obtain  good  results 
by  denoising  a  subset  of  all  possible  Fjj  and  combining  these  to  form  a  single  im¬ 
age.  For  example,  in  a  150x150  sized  image,  it  would  be  possible  to  denoise  100 
non-overlapping  15x15  blocks  and  recombine  them  to  form  a  single  image.  The 
disadvantage  to  this  approach  is  that  the  recombined  images  have  undesirable  dis¬ 
continuities  at  the  edges  of  the  block  due  to  an  uneven  restoration  of  the  means. 
These  discontinuities  can  be  mitigated  by  choosing  a  latticed  and  overlapping  set  of 
F tJ  and  averaging  their  denoised  values  together.  This  averaging  makes  the  disconti¬ 
nuities  between  blocks  less  detectable  and  less  objectionable  as  shown  in  Figure  5.10. 

5.1.4  Experimental  Results  with  the  GDD  Algorithm.  The  algorithm  was 
used  on  simulated  data  and  the  results  were  compared  with  results  obtained  using 
the  NLM  algorithm.  The  date  used  for  the  test  was  a  Light-Radar  (LIDAR)  image 
of  a  truck-mounted  resolution  board  as  a  truth  image  (Figure  5.4)  which  was  then 
corrupted  with  AWGN  with  a  =  25.  The  resulting  noisy  image  is  shown  in  Figure  5.5, 
and  the  histogram  of  the  noisy  image  is  shown  in  Figure  5.6.  A  15  x  15  subimage 
was  used  for  F,j  to  obtain  the  results  shown  in  (Figure  5.7)  for  the  NLM  algorithm 
and  (Figure  5.8)  for  the  GDD  method. 
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Figure  5.3:  Histogram  resulting  from  the  calculation  of 

2 

Fjj  —  A(s,  t)Gs<t  for  a  single  F4J-  and  all  values  of 

F 

A(s,t)GStt-  Note  the  difference  from  Figure  5.2  caused  by  ap¬ 
plying  the  scaling  factor  A (s,t). 


For  the  test  image  employed,  the  SNR  provides  a  fair  comparison  of  these  two 
techniques.  Annotating  the  truth  image  as  I,  the  mean  of  the  truth  image  as  I.  the 
image  being  denoised  as  D,  and  the  output  of  the  denoising  algorithm  as  I,  the  SNR 
of  the  noisy  image  is  calculated  as 


I  —  If 

SNR=  11 


II  —  D 


2  • 


The  SNR  of  the  denoised  image  is  calculated  as 


I  -I 

SNR  =  {} - ]]f 

I-! 


The  SNR  of  the  original  image  after  corruption  with  noise  of  cr  =  25  was  20.92.  Using 
the  NLM  algorithm,  an  SNR  of  27.97  was  achieved.  Using  the  proposed  method, 
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Figure  5.4:  Truth  image  used  to  generate  the  simulation  data. 

The  bar  to  the  right  of  the  image  indicates  pixel  values. 

an  SNR  of  41.77  was  achieved  which  represented  a  49%  improvement  over  the  NLM 
algorithm. 

Image  denoising  was  also  attempted  using  alternative  strategies  in  an  attempt 
to  minimize  the  computational  complexity  by  minimizing  the  number  of  F t  j  used  in 
the  algorithm.  A  first  attempt  was  made  to  select  F,  j  that  were  mutually  exclusive. 
For  an  a  101  x  101  image  with  zero  padding,  it  was  possible  to  denoise  the  image 
using  49  versions  of  F ij  and  10201  versions  of  GSjt-  The  results  of  this  denoising 
are  shown  in  Figure  5.9.  Although  the  SNR  in  this  case  was  34.24,  the  image 
contained  discontinuities  resulting  from  the  mean  restoration  process.  In  an  attempt 
to  minimize  these  discontinuities,  an  overlapping  lattice  of  98  Fjj  was  used  with 
10201  GS)t.  The  results  were  then  locally  averaged  together  to  form  the  image  in  in 
Figure  5.10  which  has  an  SNR  of  43.81. 
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Figure  5.5:  Truth  image  with  AWGN  of  a  =  25  added. 

5.1.5  Conclusions  Drawn  from  Initial  Results.  This  section  introduced  a 
unique  image  processing  algorithm.  Although  it  improved  over  results  obtained  using 
the  NLM  algorithm,  it  more  importantly  provided  additional  research  opportunities. 
Notably,  the  chi-square  distribution  becomes  more  pronounced  with  fewer  degrees 
of  freedom  and  the  probability  of  statistically  similar  blocks  is  expected  to  increase 
for  smaller  block  sizes.  By  decreasing  the  neighborhood  to  some  optimal  size,  it  is 
expected  that  the  performance  of  the  algorithm  can  be  improved.  In  addition,  the 
results  reflected  here  do  not  account  for  the  averaging  of  permutations  (e.g.  rota¬ 
tions  and  translations)  of  GSjf  blocks  which  may  provide  an  additional  performance 
improvement. 
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Figure  5.6:  Histogram  of  the  noisy  image. 

This  implementation  also  assumed  no  a  priori  knowledge  of  the  noise  distri¬ 
bution;  however,  in  many  applications  the  distribution  can  be  obtained  through 
measurements  of  output  from  a  given  imaging  system. 

One  significant  problem  with  this  method  was  that  it  was  computationally 
expensive  and  processing  times  were  significant  for  even  small  images.  A  second 
problem  is  that  the  results,  while  good,  are  less  impressive  than  other  state-of-the- 
art  denoising  methods.  Consequently,  other  algorithms  were  developed. 

5.2  The  HOD  and  XCD  Denoising  Algorithms 

This  section  describes  two  additional  novel  methods  for  denoising  images.  The 
algorithms  operate  by  identifying  regions  of  interest  within  a  noise-corrupted  image 
and  then  creating  noise  free  estimates  of  the  regions  as  averages  of  similar  regions 
in  the  image.  These  similar  regions  are  found  by  comparing  examining  the  statistics 
of  the  error  functions  between  the  given  region  and  other,  identically  sized  regions 
in  either  the  same  image  or  in  other  images  from  the  same  sensor.  The  statistically 
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Figure  5.7:  Output  obtained  using  the  NLM  algorithm.  SNR 

for  this  image  was  27.97. 


similar  regions  are  averaged  together  to  produce  an  estimate  of  the  noise-free  version 
of  the  region  of  interest.  This  technique  is  similar  to  multiframe  averaging;  however, 
only  a  single  frame  is  required.  The  techniques  are  shown  to  outperform  Wiener 
and  median  filtering  over  a  wide  range  of  noise  conditions  but  are  most  effective  in 
images  with  very  low  signal-to-noise  ratios. 

Section  5.2.1  describes  a  denoising  method  that  denoises  images  using  the  first, 
second  and  third  moments  of  regions  within  the  image.  Then,  Section  5.2.2  describes 
a  denoising  method  that  uses  the  first  and  second  moments  of  the  data  in  concert 
with  fast  projection-based  cross-correlations.  Section  5.2.3  describes  the  algorithms 
and  performance  in  comparison  to  Wiener  filtering  [26]  and  median  filtering  [26] 
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Figure  5.8:  Output  obtained  using  the  GDD  method.  SNR 

for  this  image  was  41.77. 

when  used  on  standard  benchmark  images  as  well  as  on  actual  LIDAR  data  with  a 
simulated  noise  component. 

5.2.1  Higher- Order  Statistics  Method  for  Block  Matching.  This  section 
discusses  using  a  method  that  employs  the  variances  of  the  blocks  and  the  skewness 
of  their  error  functions  as  measures  of  block  similarity.  This  algorithm  is  refered 
to  as  the  Higher-Order  Denoising  (HOD)  algorithm.  This  algorithm  looks  at  the 
second  moment  of  the  block  and  pairs  of  blocks  whose  error  function  has  a  third- 
order  moment  indicative  of  a  Gaussian  distribution.  A  summary  overview  of  the 
algorithm  is  provided  followed  by  the  details  of  the  algorithm. 
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Figure  5.9:  Output  obtained  by  GDD  denoising  using  49  of 
10201  possible  15  x  15  blocks.  Note  the  block  discontinuities 
that  are  similar  to  those  obtained  in  JPEG  restoration. 

5. 2. 1.1  Overview  of  the  Denoising  Algorithm  based  on  Higher  Order 

Statistics. 

1.  Within  an  image  D,  select  an  N  x  IV-sized  subimage  indexed  at  the  pixel  (i,j) 
within  the  image  to  denoise.  Call  this  block  F,;j. 

2.  Develop  estimates  for  the  maximum  variance  of  flat  regions  of  the  image,  the 
maximum  variance  of  the  error  between  two  featureless  blocks,  and  the  skew¬ 
ness  of  an  ensemble  of  noise  values. 

3.  For  every  pixel  centered  at  (s,  t)  in  the  parent  image,  define  an  N  x  N  neigh¬ 
borhood  around  the  pixel  and  call  this  block  Gsj. 
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Figure  5.10:  Output  obtained  by  GDD  denoising  using  an 

overlapping  lattice  of  98  of  10201  possible  15  x  15  blocks.  Dis¬ 
continuities  still  exist  but  are  less  objectionable  than  in  Fig¬ 
ure  5.9. 

4.  Subtract  the  means  from  all  F;J  and  all  GSjt. 

5.  For  each  block  F,  j,  evaluate  whether  the  variance  of  the  block  is  less  than  the 
upper  limit  of  the  variance  of  a  block  of  noise  only.  If  so,  set  the  entire  block 
to  the  block  mean  (i.e.  zero). 

6.  If  the  variance  of  the  block  is  above  the  threshold  used  in  5)  above,  calculate 
the  error  between  the  block  and  all  other  blocks  in  the  image.  If  the  vari¬ 
ance  and  skewness  of  the  error  function  between  F?J-  and  any  GSjt  are  within 
the  allowable  thresholds,  include  G S}t  in  an  average  of  blocks.  Calculate  the 
processed  block  as  the  average  of  F,:j-  and  all  identified  GS)t. 
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7.  Add  the  mean  of  the  original  block  to  the  processed  block.  This  is  the  restored 
block. 

8.  Repeat  steps  1-7  for  all  other  blocks  F,:j  in  the  image. 

9.  Reconstruct  the  image  as  an  average  of  all  the  overlapping  restored  blocks 
across  the  image. 

5. 2. 1.2  Description  of  the  Higher  Order  Statistical  Denoising  (HOD) 
Algorithm.  In  this  subsection  a  more  detailed  overview  of  the  image  model, 
underlying  assumptions,  and  the  mathematical  framework  of  the  HOD  algorithm  is 
provided.  The  image  model  assumes  that  the  predominate  noise  source  is  Additive 
White  Gaussian  Noise  (AWGN)  and  that  the  noise  is  independent  and  identically 
distributed  in  each  pixel.  Define  as  a  diffraction-limited,  N  x  N  block  of  an  image 
where  iVeN  and  N  >  1.  The  coordinate  pair  (i,j)  indicates  the  location  of  center 
pixel  of  the  neighborhood  Rj  within  a  larger  image  I  (i.e.  Rj  C  I).  If  I  is  corrupted 
by  zero-mean  Gaussian  noise  so  that  for  each  pixel,  v)  where  u  G  { 1 , .. . ,  iV } 

and  v  G  {1, ...,  N},  then 

D  ij(u,v)  =  Ii,j(u,v)  +  Q  ij(u,v), 

where  is  a  subimage  centered  at  (i,j)  and  is  the  realization  of  the  Gaussian 
noise  within  that  subimage.  In  matrix  notation,  this  can  be  denoted  as 


Within  an  image,  there  may  be  other  N  x  N  blocks  centered  at  coordinates  ( s,t ) 
that  satisfy  the  equation 


D 


S,t 


I ij  H“  Q s,£* 
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If  these  subimages  exist,  then  AS)t,  the  error  between  and  Ds  t,  can  be  calculated 


Ag^t  D jj  Ds,t 

(I/j  T  Q jj)  (Ijj  T  Qs,t); 

=  Qij  -  Qa,t.  (5.9) 

For  a  single  pixel  in  A, 

A s,t(u,v)  =  Q ij(u,v)  -  QSjt(u,  v). 

The  noise  characteristics  of  most  individual  sensors  can  be  determined  by  em¬ 
pirical  measurement  and  it  is  reasonable  to  assume  that  this  information  is  available 
to  the  algorithm.  In  the  algorithm  proposed  in  this  section,  it  is  assumed  that  the 
variance  of  the  predominant  noise  in  the  image  a  priori  is  known.  Given  the  vari¬ 
ance  of  the  noise,  cr,  the  statistical  characteristics  of  A  can  be  used  to  identify  and 
average  similar  blocks  in  a  given  image.  The  values  of  the  moments  of  Q  and  A 
can  be  viewed  as  random  variables  and  it  is  then  possible  to  select  matching  blocks 
based  on  the  values  of  these  moments.  For  example,  Oq,  the  variance  of  an  N  x  N, 
zero  mean,  block  of  noise  Q  can  be  calculated  as: 

<4  =  (Q  l),  (5.10) 

where  represents  squares  of  the  individual  elements  of  an  N  x  N  block  Q.  As¬ 
suming  that  the  noise  in  the  image  is  AWGN,  this  sum  of  terms  is  recognized  to  be 
a  chi-square  random  variable. 

Recalling  that  the  limiting  case  of  the  chi-square  distribution  as  N  tends  to 
infinity  is  a  Gaussian  distribution  [28] ,  the  PDF  of  the  variance  of  the  measured  noise 
in  a  block  Q  may  be  approximated  as  a  Gaussian  with  p  =  a2  and  Oq  =  2 a4/N2. 
Using  this  approach,  calculate  three  standard  deviations  from  /i  and  roughly  predict 
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the  upper  bound  on  the  PDF  of  the  variance  of  Q  as 


(5.11) 


A  similar  approach  can  be  used  to  estimate  the  upper  bound  of  the  PDF  of  the 
variance  of  A  which  is  denoted  a\  .  This  bound  can  be  calculated  as 


(5.12) 


More  precise  methods  for  determining  the  quartilcs  of  a  chi-square  distribution  are 
described  in  [28]  but  for  this  application  more  precise  methods  appear  unnecessary 
for  most  block  sizes. 

Estimation  of  the  skewness  for  a  block  of  noise  is  more  difficult  as  it  involves  the 
calculation  of  the  third  central  moment  of  an  ensemble  of  Gaussian  random  variables, 
however,  a  Monte  Carlo  simulation  was  employed  to  arrive  at  a  polynomial  function 
of  block  size  to  estimate  skewness.  Figure  5.11  shows  the  results  of  the  Monte  Carlo 
simulation  and  shows  the  measured  maximum  skewness  for  various  block  sizes  and 
noise  values.  This  maximum  skewness  is  independent  of  the  variance  of  the  noise. 
The  magnitude  of  the  bound  on  the  skewness  of  the  error  function  is  represented  as 
Smax,  where  the  skewness  of  an  ensemble  of  random  variables  Q  of  size  n  x  n  and 
with  mean  /jq  is  defined  as 


Q  (EEiE”=i(Q(W)-iiq)2)3'  l'  ' 

Once  these  limits  are  determined,  processing  of  the  image  can  begin.  Define  an 
image  D  of  size  S  xT  where  S  G  N  and  T  G  N.  Also,  take  an  N  x  A-sized  subimage 
F itj  C  D  where  F.y  is  centered  at  i  G  S  =  {1, ...,  /S'},  j  G  7  =  {1, ...,  T}  and  where 
{A  —  2n  +  1  |  n  G  N}.  If  the  image  D  is  zero  padded  by  n  in  all  directions, 
then  for  all  s  G  §  and  i  G  T,  there  are  N 2  —  1  other  subimages  in  D  which  may 
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Figure  5.11:  Plot  of  the  measured  maximum  skewness  vs.  N  -  the  square  root 

of  the  block  size.  The  plot  displays  the  mean,  maximum  and  minimum  measured 
values  of  maximum  skewness  for  various  block  sizes  of  AWGN  with  uq  =  5  to  150. 

be  similar  in  an  L2-norm  sense  to  F  and  one  subimage  where  s  =  i  and  t  =  j. 
These  subimages  are  denoted  GSi*.  Furthermore,  call  the  index  of  each  pixel  in  these 
subimages  (u,v)  where  u  E  U  ==  {1, ...,  Nj  and  v  E  V  =  {1, ...,  N}.  In  an  attempt  to 
remove  illumination  and  reflectance  differences,  the  means  are  removed  from  both 
the  Fjj  and  GSjt.  For  all  x  and  y : 

Ftj(x,y)  =  FlJ(x,y)  -  F hj(u,v)  (5.14) 

uEl l,v£V 


and 


Gs,t(x,y)  =  Ga,t{x,y)  -  GSit(u,  v)  V  s  e  S,  t  E  T. 


u£Vl,v£V 


(5.15) 


Most  natural  images  have  significant  low-frequency  content  that  can  be  de- 
noised  using  first-order  statistics.  Where  the  measured  variance  of  a  block  is  less 
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than  the  maximum  expected  variance  of  a  block  of  noise,  a  denoised  version  is  of 
the  block  is  estimated  by  replacing  all  the  pixels  of  the  block  with  the  block  mean. 
Specifically,  for  any  block  F,  j  that  has  a jj.  <  crr2nax ,  a  noiseless  version  of  each  pixel 
in  F ij  is  estimated  as 

HOD(FiJ)(«,u)  =  ^  Fh(bv)'  (5-16) 

uG  U,vGV 

Regions  that  have  significant  high  frequency  content  are  denoised  by  identifying 
and  averaging  a  subset  of  blocks  GS)t  that  have  similar  statistical  characteristics. 
This  set  of  blocks  with  similar  characteristics  is  represented  as  B  and  the  s,  t  pair 
corresponding  to  a  member  of  B  is  as  /3.  A  denoised  version  of  block  F tJ  is  then 
constructed  as 


HOD(Fjj)  —  — — ^Fjj  +  +  —  ^2  F  (5-17) 

I  I  ueu,veV 

where  the  last  term  is  the  mean  that  was  previously  subtracted  in  (5.14).  Given  the 
error  vector  between  two  blocks  AS)t  with  mean  HA3t,  the  members  of  B  are  those 
G  where 

Jp  Y,  (As,t{u,v)  -  ^Aj2  <  o-lmax,  (5.18) 

uGU,v£V 


and  where  the  skewness  of  the  error  function  is  evaluated  to  determine  if 


Y ^ueu,vev(AsAu,  v)  ~  Mm)3 
(E«eu,«ev(A«,t(w>u)  -  MAs,t)2)3 


(5.19) 


Using  this  approach,  all  F;J-  in  an  image  are  evaluated  against  all  Gs,«-  An  estimate 
of  the  noise-free  image  must  now  be  created  from  the  noise  free  estimates  of  the 
individual  blocks  in  the  image.  The  algorithm  concludes  by  reconstructing  the  image 
as  the  average  of  all  overlapping  blocks  HOD(Fjj). 
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5.2.2  Correlation- Based  Method  for  Block  Matching.  This  section  discusses 
using  a  method  that  employs  the  variances  of  the  blocks  and  their  error  functions  in 
conjunction  with  a  method  for  evaluating  the  similarity  of  the  blocks  based  on  their 
cross  correlation  peaks.  This  method  is  referred  to  as  the  cross  correlation  denoising 
(XCD)  algorithm.  The  algorithm  looks  at  the  second  moment  of  the  block  and 
evaluates  whether  comparative  blocks  are  spatially  correlated  with  the  block  under 
study.  In  an  effort  to  reduce  processing  time,  the  algorithm  replaces  the  calculation 
of  the  third  moment  with  a  projection-based  correlation  to  determine  whether  or  not 
the  peak  correlation  of  two  blocks  is  located  at  the  center  of  their  cross  correlation. 

5.2.2. 1  Overview  of  the  Cross  Correlation  Denoising  Algorithm. 
Most  steps  of  the  XCD  algorithm  are  identical  to  those  in  the  HOD  algorithm  de¬ 
scribed  in  section  5.2.1;  however,  instead  of  calculating  the  skewness  of  the  error 
vector  between  two  blocks  in  step  6,  the  algorithm  uses  the  projections  of  the  blocks 
to  calculate  their  cross  correlations  and  observe  the  location  of  the  cross  correlation 
peak.  The  processing  steps  of  the  XCD  algorithm  are: 

1.  Within  an  image  D,  select  an  N  x  N- sized  subimage  indexed  at  (i,j)  within 
the  image  to  denoise.  Call  this  block  F?j. 

2.  Develop  estimates  for  the  maximum  variance  of  flat  regions  of  the  image  and 
the  maximum  variance  of  the  error  between  two  featureless  blocks. 

3.  For  every  pixel  centered  at  (s,  t)  in  the  parent  image,  define  an  N  x  N  neigh¬ 
borhood  around  the  pixel  and  call  this  block  GSjt- 

4.  Subtract  the  means  from  all  the  blocks. 

5.  For  each  block,  evaluate  whether  the  variance  of  the  block  is  less  than  the 
upper  limit  of  the  variance  of  a  block  of  noise  only.  If  so,  set  the  entire  block 
to  the  block  mean. 
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6.  If  the  variance  of  the  block  is  above  the  threshold  used  above,  calculate  the 
error  between  the  block  and  all  other  blocks  in  the  image.  If  the  variance  of  the 
error  function  between  Fjj  and  any  Gsj  are  within  the  allowable  thresholds, 
calculate  the  cross  correlations  of  the  projections.  If  the  cross-correlation  peaks 
of  the  projections  are  in  their  centers,  include  G in  an  average  of  blocks. 
Calculate  the  processed  block  as  the  average  of  Fjj  and  all  identified  GStt. 

7.  Add  the  mean  of  the  original  block  to  the  processed  block.  This  is  the  restored 
block. 

8.  Repeat  step  1-7  for  all  other  blocks  in  the  image. 

9.  Reconstruct  the  image  as  an  average  of  all  the  restored  blocks  in  the  image. 

5. 2. 2. 2  Description  of  the  Correlation- Based  Denoising  (XCD)  Method 
for  Block  Matching.  An  alternative  method  of  block  selection,  which  is  also  based 
on  L2  distance,  is  also  effective.  In  many  cases,  blocks  within  an  image  are  most 
similar,  in  an  L2  sense,  to  shifted  versions  of  themselves.  Although  these  shifted 
blocks  are  close  in  L2  distance,  they  may  introduce  structurally  different  blocks 
into  a  block  average  thereby  biasing  the  result.  This  is  especially  true  along  edges 
of  image  features.  When  the  shift  is  in  the  direction  of  an  edge,  this  contributes 
constructively  to  a  block  averaging  algorithm.  When  the  shift  is  perpendicular  to 
an  edge  but  the  resulting  block  is  close  in  L2  distance,  it  has  the  effect  of  smoothing 
the  edges  in  a  denoised  block  and  thereby  reducing  high-frequency  image  content. 

Blocks  that  are  close  in  L2  distance  may  be  predicted  by  looking  at  the  au¬ 
tocorrelation  of  a  block  being  denoised.  The  primary  peak  of  the  autocorrelation 
corresponds  to  the  [0,  0]  shift.  The  subpeaks  with  magnitudes  less  than  the  primary 
peak  correspond  to  the  center  pixels  of  blocks  whose  that  are  shifted  versions  of  a 
block  of  interest  and  are  close  that  block  in  L2  distance.  This  observation  suggests 
that  it  may  be  possible  to  find  blocks  that  have  similar  content  by  considering  the 
location  of  the  peak  of  the  cross  correlation  of  two  blocks  that  are  close  in  L2  dis- 
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tance.  This  may  be  a  computationally  expensive  approach;  however,  good  results 
can  be  attained  while  minimizing  computational  load  by  using  1-D  projection-based 
cross-correlations  instead  of  2-D  cross-correlations. 

For  image  registration  applications,  Cain  et  al.  [11]  described  an  algorithm  that 
uses  the  x  and  y  projections  of  an  image  to  find  their  2-D  cross  correlation  peak.  In 
this  algorithm,  the  cross-correlation  peaks  of  the  one-dimensional  projections  of  two 
images  corresponds  to  the  x  and  y  coordinates  of  the  2-D  cross  correlation.  This  is 
shown  to  be  a  computationally-efficient  alternative  to  the  more  traditional  approach 
of  finding  the  two-dimensional  cross-correlation  peak  of  the  images.  In  order  to 
exploit  the  computational  efficiency  of  this  approach  in  the  XCD  algorithm,  the  x 
and  y  projections  are  of  each  block  are  calculated  as  a  preprocessing  step  when  the 
blocks  are  created.  Then,  for  each  Fjj,GS)j  pair  whose  measured  variance  is  below 
the  calculated  maximum  allowable  variance,  it  is  necessary  to  calculate  two,  1-D 
cross  correlations.  For  a  given  GS)t,  if  the  cross  correlation  peak  is  centered  in  both 
1-D  cross  correlations,  Gsj  is  included  in  the  average  of  similar  blocks. 

Another  set  of  blocks  with  (A2.)  <  a \  can  now  be  constructed  that  have 
centered  cross-correlation  peaks  for  both  row  and  column  projections.  This  set  is 
denoted  T  and  the  s,t  pair  corresponding  to  a  member  of  T  is  7.  Using  the  set  T,  a 
denoised  version  of  block  F,  j  can  be  constructed  as 

XCD( FfJ)  =  (Fy  +  E7erG„)  +  T  5]  Fy («,«).  (5.20) 

I  I  +  UGU,veV 

As  in  the  previous  section,  the  restored  image  is  constructed  as  an  average  of  all  the 
denoised  estimates  of  the  blocks. 

5.2.3  HOD  and  XCD  Simulation  Results.  This  section  presents  results  us¬ 
ing  LIDAR  and  standard  benchmark  images  as  truth  with  additive  Gaussian  noise. 
These  results  demonstrate  that  the  HOD  and  XCD  denoising  methods  can  suc- 
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Figure  5.12:  Graph  of  results  comparing  output  of  HOD  filtering,  XCD  filtering, 
Wiener  filtering  and  median  filtering  of  the  101  x  101  image  shown  in  Figure  5.5 
over  a  wide  range  of  noise  values.  Typical  block  sizes  (6  x  6)  are  chosen  for  the  HOD 
and  XCD  algorithm.  The  line  in  the  graph  labelled  “Input  PSNR  =  Output  PSNR” 
shows  the  point  where  denoising  methods  produce  results  with  lower  PSNR  than  the 
original  image.  The  HOD  and  XCD  methods  approach  but  do  not  reach  this  line. 

cessfully  denoise  images  with  results  that  are  consistently  favorable  to  Wiener  and 
median  filtering  and  on  par  with  many  wavelet  denoising  methods.  In  the  following 
discussion  and  graphs,  results  are  presented  using  PSNR  where  the  PSNR  of  an 
S  x  T  sized  image  is  defined  as: 


PSNR 


10  log 


(5.21) 


where  I  is  the  diffraction-limited  image,  I  is  an  estimate  of  the  image,  Imax  is  the 
maximum  value  found  in  the  image  I. 

The  performance  of  the  algorithm  was  examined  using  various  images.  The 
variable  parameters  in  these  simulations  were  the  individual  images,  the  variance 
of  the  AWGN  and  the  block  sizes  used  in  the  denoising  algorithms.  Results  are 
presented  for  each  of  the  two  algorithms. 
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Figure  5.13:  Block  size  vs.  HOD  out¬ 
put  for  a  101  x  101  image.  For  each 
block  size,  the  image  was  corrupted  with 
ten  noise  realizations  of  (Jnoise  =  HO 
(mean  input  PSNR  =  23.26)  and  de- 
noised.  The  maximum,  minimum  and 
mean  PSNRs  for  10  runs  using  each 
block  size  are  plotted. 


Figure  5.14:  Block  size  vs.  XCD  out¬ 
put  for  the  101  x  101  image.  As  in  5.13, 
the  image  was  corrupted  with  ten  noise 
realization  of  crnoise  =  110  (mean  input 
PSNR  =  23.26)  and  denoised.  The  max¬ 
imum,  minimum  and  mean  PSNRs  for 
10  runs  using  each  block  size  are  plot¬ 
ted. 
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Experimentation  began  with  the  LIDAR  image  shown  in  Figure  5.5  and  ex¬ 
amined  the  effectiveness  of  the  algorithm  with  various  block  sizes  and  noise  values. 
The  results  of  the  algorithms  across  a  range  of  input  noise  values  are  shown  in  Fig¬ 
ure  5.12.  This  graph  shows  the  results  in  comparison  to  optimally-sized  Wiener  and 
median  filters.  The  diagonal  lines  across  the  graph  indicates  the  point  where  the  out¬ 
put  PSNR  is  equal  to  the  input  PSNR.  To  the  right  of  this  line,  output  PSNRs  are 
less  than  input  PSNRs  indicating  that  Wiener  filtering  and  median  filtering  actually 
degrade  the  image.  On  the  left  hand  side  of  the  diagonal  line  lies  the  region  where 
all  three  algorithms  improve  the  PSNR  of  an  input  image.  The  HOD  and  XCD  al¬ 
gorithms  outperform  the  optimally-sized  Wiener  and  median  filters  (as  implemented 
by  MATLAB  7.1)  across  this  region. 

Output  performance  is  also  dependent  upon  block  size.  In  (5.17)  and  (5.20),  for 
any  given  block  F,  j  the  results  of  the  algorithm  are  dependent  upon  the  number  of 
blocks  G S)t  that  contribute  to  the  average  in  HOD(Fij)  and  XCD(Fjj).  Recalling 
that  the  data  model  is  D ij  =  \h]  +  Q,:y,  an  average  of  blocks  with  identical  I 
but  different  Q  is  expected  to  converge  to  I  as  the  number  of  blocks  increases.  In 
general,  at  small  block  sizes,  it  is  more  likely  to  find  blocks  Gs  t  that  are  similar 
in  underlying  content  to  F,:j;  however,  a  small  sample  size  is  not  as  likely  to  have 
higher-order  statistics  that  are  predicted  by  the  model.  As  the  block  size  increases, 
confident  in  the  statistics  increases,  but  it  becomes  less  probable  that  blocks  can  be 
found  with  matching  image  content. 

The  graphs  in  Figure  5.13  and  Figure  5.14  demonstrate  the  trade-off  involved 
in  choosing  block  size  for  the  image  shown  in  Figure  5.5  corrupted  by  AWGN.  The 
graphs  show  the  maximum,  minimum  and  mean  values  for  the  output  of  ten  different 
realizations  of  noise  across  various  block  sizes.  For  this  image  the  optimal  block  size 
for  the  HOD  algorithm  is  7  x  7  and  for  the  XCD  algorithm  is  9  x  9.  In  general, 
a  block  size  of  approximately  6x6  provides  good  results.  Figures  5.15  through 
5.23  show  output  results  using  a  512  x  512  image  created  using  frame  averaging 


Figure  5.15:  512  x  512  image  of  a 

tank  derived  from  L1DAR  data. 


Figure  5.16:  Image  of  Figure  5.15  cor¬ 
rupted  with  noise  of  a  —  9000,  input 
PS  NR  =  18.96. 


Figure  5.17:  Image  in  Figure  5.16  de- 
noised  using  HOD  and  a  block  size  of 
six.  Output  PSNR  =  33.17. 


Figure  5.18:  Image  of  Figure  5.17  de- 
noised  using  Wiener  filtering.  Output 
PSNR  =  26.23. 


of  LIDAR  data.  This  image  was  corrupted  with  Gaussian  noise  and  then  denoised 
using  HOD  and  Wiener  filtering.  Results  for  a  256  x  256  subregion  of  the  image  are 
shown  in  Figures  5.2.3  through  5.2.3.  Beginning  with  the  corrupted  image  shown 
in  Figure  5.20,  results  for  the  Wiener  filter  shown  in  Figure  5.2.3  and  for  the  HOD 
algorithm  in  Figure  5.2.3.  The  the  results  of  the  HOD  algorithm  are  not  only  better  in 
PSNR  but  are  visually  more  appealing  than  median  or  Wiener  filtering.  Figures  5.15 
through  5.23  show  the  results  of  filtering  the  entire  512  x  512  tank  image  with  AWGN 
of  a  —  9000  and  input  PSNR  =  18.96.  In  this  image,  the  advantage  of  HOD  over 
Wiener  filtering  are  even  more  apparent.  For  benchmarking  purposes,  the  algorithms 
were  also  applied  against  the  standard  images  Lena,  Barbara,  Boats,  House,  Peppers 
and  compared  with  other  denoising  algorithms  in  the  literature.  The  results  of  the 
HOD  and  XCD  algorithms  using  a  constant  block  size  of  6  x  6  are  shown  in  Tables 
5.1  and  5.2. 

The  exemplar-based  denoising  algorithm  described  Kervrann  and  Boulanger 
[31,32]  is  of  interest  because  it  is  a  non-transform  domain  algorithm  that  also  uses 
L2  distance  for  block  selection.  Comparative  results  for  the  Peppers  image  are  shown 
in  Figure  5.24.  Exemplar-based  denoising  performed  better  than  both  the  HOD  and 
XCD  algorithms  at  relatively  low  noise  levels  (cr  <  70)  but  did  not  perform  as  well 
at  the  higher  noise  levels  that  are  common  in  passive  infrared  and  LIDAR  imaging. 

The  algorithms  also  compared  favorably  to  the  SUREshrink  and  Bayeshrink 
wavelet  coefficient  shrinkage  algorithms  that  are  described  in  [19]  and  [14]  and  are 
evaluated  by  Chang  in  [14]  within  a  range  of  PSNRs  from  approximately  of  17  to 
28.  The  SUREshrink  and  Bayeshrink  algorithms  determine  and  apply  thresholds  to 
coefficients  in  the  wavelet  domain. 

Overall,  the  XCD  and  HOD  algorithms  compared  favorably  with  most  de¬ 
noising  algorithms  but  fell  short  of  the  reported  results  for  the  most  recent  wavelet 
coefficient  shrinkage  algorithms  that  examine  and  combine  neighborhood  statistics 
including  [15],  [17],  [42],  and  [44],  However,  the  algorithms  proposed  do  provide  a 
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a /Input  PS  NR 

Lena 

512x512 

Barbara 

512x512 

Boats 

512x512 

House 

256x256 

Peppers 

256x256 

10/28.14 

34.39 

33.10 

32.25 

35.05 

33.71 

20/22.10 

30.63 

27.86 

28.57 

30.88 

29.23 

30/18.57 

28.67 

24.99 

26.39 

28.63 

26.50 

50/14.16 

26.77 

22.93 

24.53 

25.72 

23.75 

75/10.64 

25.25 

22.19 

23.34 

24.47 

22.13 

100/8.14 

24.07 

21.55 

22.53 

23.45 

21.28 

125/6.20 

22.93 

20.93 

21.71 

22.43 

20.54 

150/4.59 

21.99 

20.29 

20.96 

21.60 

20.05 

Table  5.1:  Output  PSNRs  of  the  HOD  method  using  a  block  size  of  six  applied 
across  several  standard  images  with  additive  noise  of  varying  standard  deviation. 


a  /Input  PS  NR 

Lena 

512x512 

Barbara 

512x512 

Boats 

512x512 

House 

256x256 

Peppers 

256x256 

10/28.14 

34.45 

32.98 

32.46 

35.05 

33.72 

20/22.10 

30.77 

27.70 

28.72 

30.98 

29.42 

30/18.57 

28.91 

24.94 

26.75 

28.87 

26.54 

50/14.16 

26.93 

23.01 

24.80 

26.13 

24.26 

75/10.64 

25.30 

22.20 

23.46 

24.61 

22.50 

100/8.14 

24.06 

21.54 

22.56 

23.44 

21.36 

125/6.20 

22.89 

20.90 

21.70 

22.37 

20.56 

150/4.59 

21.94 

20.26 

20.94 

21.55 

20.05 

Table  5.2:  Output  PSNRs  of  the  XCD  method  using  a  block  size  of  six  applied 
across  several  standard  images  with  additive  noise  of  varying  standard  deviation. 

mechanism  for  implementing  neighborhood-based  denoising  in  a  manner  that  yields 
impressive  results  and  could  be  relatively  straightforward  to  implement  in  combina¬ 
tional  logic. 

5.3  Chapter  Summary 

This  chapter  of  the  dissertation  has  provided  a  review  of  recent  image  pro¬ 
cessing  literature  on  single-frame  image  denoising  and  developed  and  demonstrated 
three  similar  block-based  denoising  algorithms.  These  algorithms  exploited  different 
measures  of  similarity  between  blocks  than  those  used  by  other  denoising  algorithms 
in  the  literature.  The  algorithms  described  in  this  chapter  are  also  different  from 
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Method 

Lena 

512x512 

Barbara 

512x512 

Boats 

512x512 

House 

256x256 

Peppers 

256x256 

HOD 

24.07 

21.55 

22.53 

23.45 

21.28 

XCD 

24.06 

21.54 

22.56 

23.44 

21.36 

Exemplar  [31] 

23.32 

20.64 

21.78 

23.08 

20.51 

Median  Filter 

15.77 

15.33 

15.68 

15.63 

15.68 

Wiener  Filter 

15.46 

15.24 

15.43 

15.41 

15.39 

Table  5.3:  Results  of  various  methods  with  input  PSNR  =  8.14  using  a  block  size 
of  six  applied  across  several  standard  images  with  additive  noise  of  varying  standard 
deviation. 

those  described  in  the  literature  because  they  suspend  the  requirement  for  informa¬ 
tion  included  in  the  average  to  be  located  in  close  spatial  proximity  to  the  pixel 
being  denoised.  In  all  of  the  algorithms  reviewed  in  Section  2.5,  and  in  the  more  ba¬ 
sic  low-pass  and  median  filters  described  in  the  introduction,  close  spatial  proximity 
to  a  given  pixel  was  an  primary  consideration  in  selecting  other  pixels  to  include  in 
an  average.  The  algorithms  have  been  shown  to  achieve  better  results  than  many 
neighborhood  filters  by  suspending  this  requirement.  Computational  load  was  also 
reduced  in  a  number  of  areas  by  using  binary  weighting  schemes. 

Overall,  the  algorithms  worked  best  and  had  the  lowest  processing  times  when 
dealing  with  images  with  significant  amounts  of  noise  (e.g.  Input  PSNR  <  14).  With 
less  noise  in  an  image,  it  becomes  increasingly  difficult  to  find  blocks  that  are  close 
in  L2  distance  with  statistics  that  meet  algorithm  criteria.  This  observation  may 
indicate  a  fundamental  limit  on  denoising  methods  that  rely  on  image  statistics. 

One  of  the  more  interesting  aspects  of  both  the  methods  introduced  here  and 
the  techniques  reviewed  in  Section  2.5  is  the  that  the  smoothing  algorithms  all  gener¬ 
ally  produce  results  that  exceed  the  CRLB.  It  is  a  relatively  simple  exercise  to  show 
that  the  maximum-likelihood  estimate  of  an  image  I  which  corrupted  with  AWGN 
with  variance  cr2  is  D.  It  is  also  fairly  simple  to  show  that  the  variance  on  an  esti¬ 
mate  of  a  pixel  in  the  image  is  also  a2.  Block-based  or  other  smoothing  algorithms 
provide  estimates  that  are,  for  most  images,  much  better  than  a  straightforward 
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CRLB  calculation  would  predict  by  introducing  additional  assumptions  about  the 
frequency  content  of  natural  images.  In  particular,  the  underlying  assumption  is 
that  most  natural  images  contain  predominantly  low-frequency  spatial  content  and 
that  by  averaging  using  linear  or  nonlinear  filters  a  reasonable  estimate  of  the  images 
under  study  may  be  provided. 

In  addition  to  the  methods  noted  in  Section  2.5,  many  state-of-the-art  wavelet 
coefficient  shrinkage  denoising  methods,  including  those  discussed  in  [17],  [44],  [42] 
among  others,  rely  on  combining  of  neighborhood  information  in  the  wavelet  trans¬ 
form  domain.  Regardless  of  the  domain  used,  in  the  presence  of  noise,  the  ability 
to  combine  information  from  these  neighborhoods  is  subject  to  some  degradation. 
Follow-on  work  may  include  investigation  into  the  fundamental  performance  limits 
encountered  by  these  algorithms. 
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Figure  5.19:  Original  256  x  256  LI-  Figure  5.20:  Tank  image  with  Ad- 

DAR  image  of  a  tank  resulting  from  a  ditive  White  Gaussian  Noise,  crnoise  = 
multiframe  average.  5000,  input  PSNR  =  19.33. 


Figure  5.21:  Tank  image  with  Ad¬ 

ditive  White  Gaussian  Noise,  (input 
PSNR  =  19.33)  after  Wiener  filtering. 
Output  PSNR  =  26.43. 


Figure  5.22:  Tank  image  with  Ad¬ 

ditive  White  Gaussian  Noise,  (input 
PSNR  =  19.33)  after  HOD  filtering  with 
block  size  of  5,  output  PSNR  =  31.00. 
Note  that  in  addition  to  reducing  sta¬ 
tionary  AWGN,  non-stationary  readout 
noise  has  also  been  reduced. 
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Figure  5.23:  The  “method  noise”  derived  by  subtracting  the  denoised  image  found 
in  Figure  5.17  from  the  original  noisy  image  found  in  Figure  5.16.  Note  the  absence 
of  feature  content  in  this  image.  The  actual  value  of  a  was  8976.  The  measured 
value  of  a  in  this  method  noise  is  8801. 


Figure  5.24:  Results  of  the  HOD  and  XCD  methods  compared  against  the 

Exemplar-based  image  denoising  algorithm  described  in  [31]  using  results  reported 
in  [31]. 
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Figure  5.25:  Results  of  the  HOD  and  XCD  methods  compared  against  the 

Bayeshrink  and  SUREshrink  algorithms  described  in  [14]  and  [19]  using  results  re¬ 
ported  in  [14]. 
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VI.  Conclusions 


This  dissertation  has  introduced  research  into  image  registration  and  single¬ 
frame  denoising  which  has  yielded  novel  image-processing  algorithms  and  im¬ 
proved  general  theoretical  understanding  of  bounds  on  the  performance  of  shift  esti¬ 
mators.  The  dissertation  introduced  methods  to  improve  image  quality  and  explored 
the  theoretical  limits  of  an  algorithm’s  ability  to  achieve  these  improvements.  In  this 
final  chapter,  the  results  from  the  previous  chapters  are  summarized  and  additional 
research  is  proposed  that  can  extend  the  efforts  described  in  the  earlier  chapters  of 
this  dissertation. 

This  chapter  is  organized  as  follows:  In  Section  6.1  a  summary  of  the  significant 
contributions  of  Chapters  III,  IV  and  V  is  provided.  Then,  in  Section  6.2,  areas  that 
are  believed  to  yield  fruitful  research  that  will  extend  the  work  performed  in  this 
dissertation  are  discussed. 

6. 1  Summary  of  Results  and  Contributions 

This  section  provides  an  overview  of  contributions  from  the  dissertation. 

6.1.1  Review  of  Results  in  Chapter  III.  Chapter  III  provided  a  method 
for  improving  the  performance  of  projection-based  image  registration  algorithms  at 
minimal  computational  cost.  It  explained  how  a  low-pass  filtering  can  be  designed 
to  exploit  spatial  correlations  in  an  image  and  improve  the  performance  of  image 
registration  algorithms.  It  also  described  experiments  conducted  with  actual  test 
data  that  have  confirmed  our  analytical  results. 

The  major  contributions  of  this  chapter  included  a  generalization  and  modifi¬ 
cation  of  the  FOM  of  Cain  et  al.  [11]  so  that  it  could  be  applied  to  image  projections 
containing  correlated  noise.  This  was  necessary  to  appy  the  FOM  to  filtered  projec- 
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tions  the  filtered  case.  This  new  FOM  was  then  used  in  a  procedure  for  finding  the 
length  of  a  boxcar  filter  which  was  applied  to  the  projections  of  an  image. 

A  second  contribution  of  this  chapter  was  the  use  of  the  FOM  in  a  procedure 
that  optimized  a  boxcar  filter  to  minimize  the  mean-squared  error  of  shift  estimates 
created  from  projections  of  an  image.  The  chapter  showed  that  when  uncorrelated 
noise  is  added  to  an  image,  the  covariance  structure  of  a  projected  image  remains 
largely  intact.  A  procedure  for  estimating  the  covariance  function  from  available 
noise-corrupted  data  was  introduced.  The  covariance  function  was  then  used  in 
conjunction  with  the  revised  FOM  to  find  the  optimal  length  of  boxcar  filter  that 
minimized  the  mean-squared  error  of  shift  estimates  even  in  low-SNR  environments. 

The  filters  were  compared  with  other  low-pass  filters  and  shown  to  be  both 
computationally  efficient  and  effective.  Results  showed  an  improvement  by  factors 
up  to  5.5  in  mean-squared  error  and  a  reduction  by  at  least  0(N)  in  computational 
complexity  from  2-D  methods.  Further  computational  advantages  were  also  dis¬ 
cussed  for  this  filtering  method  compared  with  other  filtering  methods  described  in 
the  literature  for  reducing  the  mean-squared  error  of  shift  estimates. 

6.1.2  Review  of  Results  in  Chapter  IV.  Chapter  IV  presented  a  calculation 
and  comparison  of  theoretical  performance  bounds  for  image  registration  algorithms. 
It  showed  that  for  large  images  under  conditions  of  full-frame  registration,  the  CRLB 
is  an  adequate  measure  of  performance  for  most  realistic  imaging  conditions.  For 
projected,  optically  filtered,  or  small  images,  however,  the  CRLB  may  not  sufficiently 
predict  bounds  on  the  performance  of  a  registration  algorithm  and  the  Barankin 
bound  was  introduced  as  a  method  for  providing  a  more  accurate  estimate. 

Chapter  IV  first  examined  the  one-dimensional  case  of  filtered  projections  and 
derived  analytical  expressions  for  the  CRLB  and  Barankin  bound  of  a  shift  estimate 
for  two  filtered  and  projected  images  of  the  same  scene.  This  was  compared  with  the 
CRLB  and  Barankin  bound  of  shift  estimates  generated  from  2-D  data.  The  results 


for  the  1-D  case  were  shown  to  be  have  much  higher  lower  bounds  on  mean-squared 
error  and,  that  for  low-SNR  projected  images,  there  was  a  significant  difference 
between  the  mean-squared  error  predicted  by  the  CRLB  and  that  predicted  by  the 
Barankin  bound. 

The  chapter  then  examined  and  compared  the  effect  of  focal-length  errors  on 
the  lower  bounds  of  the  mean-squared  error  of  shift  estimators.  For  the  test  images, 
the  CRLB  and  Barankin  bounds  increased  as  the  severity  of  focal-length  errors  in¬ 
creased  in  the  simulations.  In  defocused  imagery,  the  Barankin  bound  provided  a 
higher  estimate  of  SNR  than  that  predicted  by  the  CRLB  even  at  moderate  noise 
levels. 

It  is  also  worth  noting  that  Cain  et  al.  [11]  show  that  shift  estimates  for  low 
intensity  images  in  the  presence  of  fixed  pattern  noise  may  actually  be  better  using 
projections  than  full  2-D  estimates.  Bounds  on  this  behavior  would  also  make  an 
interesting  future  study. 

Perhaps  the  most  interesting  aspect  of  the  research  documented  in  this  paper 
was  that  the  bounds  under  study  were  most  applicable  to  distortions  of  small  images. 
In  many  target-recognition  applications,  objects  being  imaged  may  be  rotated  and 
scaled  and  salient  features  may  be  extracting  using  various  filtering  techniques.  This 
observation  suggests  many  follow-on  applications  which  are  discussed  next  in  Section 
6.2. 


6.1.3  Review  of  Results  in  Chapter  V.  In  Chapter  V,  three  similar  block- 
based  denoising  algorithms  were  developed  and  demonstrated.  These  algorithms 
identified  similar  regions  within  a  single  image  that  could  be  used  to  create  block 
averages  in  a  way  that  was  similar  to  the  multiframe  averaging  facilitated  by  image 
registration  in  previous  chapters.  The  new  algorithms  exploit  different  measures 
of  similarity  between  blocks  than  those  used  by  other  denoising  algorithms  in  the 
literature  and  use  efficient  binary  weighting  schemes  in  their  block  averages.  These 
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algorithms  are  also  different  from  those  described  in  the  literature  because  they 
suspend  the  requirement  for  information  included  in  the  average  to  be  located  in  close 
spatial  proximity  to  the  pixel  being  denoised.  The  chapter  showed  that  in  low-SNR 
situations  these  algorithms  could  achieve  better  results  than  many  neighborhood 
filters  by  suspending  this  requirement.  Computational  load  was  also  reduced  in  a 
number  of  areas  by  using  binary  weighting  schemes. 

The  three  algorithms  introduced  in  Chapter  V  were  the  Gaussian-Detection 
Denoising  (GDD)  algorithm,  the  Higher-Order  Denoising  (HOD)  algorithm  and  the 
Cross-Correlation  Denoising  (XCD)  algorithm.  The  GDD  algorithm  attempted  to 
identify  similar  blocks  by  evaluating  whether  or  not  their  error  functions  belonged 
to  a  Gaussian  distribution.  This  algorithm  did  remove  some  noise  in  the  image 
but  was  less  successful  than  other  methods,  including  the  Wiener  filter.  This  was 
not  the  case  with  the  HOD  algorithm.  This  algorithm  looks  at  the  second  moment 
of  the  block  and  pairs  of  blocks  whose  error  function  has  a  third-order  moment 
indicative  of  a  Gaussian  distribution.  This  algorithm  consistently  outperformed 
low-pass  filtering  techniques  including  the  Wiener  filter  and  was  shown  to  be  on- 
par  with,  and  in  some  cases  better  than,  other  noise  reduction  algorithms  found  in 
current  literature.  The  XCD  algorithm  was  the  third  denoising  algorithm  developed 
and  demonstrated  in  this  chapter.  This  algorithm,  attempted  and  succeeded  in 
achieving  the  performance  of  the  HOD  algorithm,  while  using  block  projections  to 
reduce  the  processing  requirements  of  the  computations.  Most  of  the  steps  of  the 
XCD  algorithm  were  identical  to  those  in  the  HOD  algorithm;  however,  instead  of 
calculating  the  skewness  of  the  error  vector  between  two  blocks,  the  projections  of 
the  blocks  were  used  to  calculate  their  cross  correlations  and  observe  the  location  of 
the  cross-correlation  peak. 

Overall,  the  algorithms  worked  best  and  had  the  lowest  processing  times  when 
dealing  with  images  with  significant  amounts  of  noise  (e.g.  input  PSNR  <  14  dB). 
With  less  noise  in  an  image,  it  becomes  increasingly  difficult  to  fold  blocks  that  are 
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close  in  L2  distance  with  statistics  that  meet  our  criteria.  This  observation  may 
indicate  a  fundamental  limit  on  denoising  methods  that  rely  on  image  statistics. 
Potential  follow-on  work  to  the  work  described  in  this  chapter  is  discussed  in  depth 
in  Section  6.2  but  may  include  investigation  into  the  fundamental  performance  limits 
encountered  by  these  algorithms. 

6.2  Recommended  Future  Research 

This  section  outlines  additional  research  efforts  that  could  be  taken  to  extend 
the  work  described  in  this  dissertation.  Further  research  is  described  that  could  be 
performed  in  the  areas  of  image  registration,  bounds  on  registration  performance 
and  block-based  denoising. 

6.2.1  Image  Registration.  Chapter  III  discusses  how  image  smoothing  and 
bias  reduction  are  used  jointly  to  improve  the  performance  of  image  registration 
algorithms  [3,22,50].  In  Chapter  III  it  was  shown  that  the  smoothing  portion  of  the 
filtering  could  be  accomplished  using  a  low-pass  filter  to  eliminate  noise.  The  chapter 
also  proposed  that  bias  reduction  in  the  algorithm  could  be  performed  optically  and 
simulated  this  optical  filtering  in  experiments.  The  chapter  did  not  attempt  to 
quantify  the  exact  parameters  for  the  defocus  that  would  be  required  to  perform 
this  optical  filtering.  This  presents  another  opportunity  for  future  research. 

Based  on  the  research  in  this  chapter,  it  is  possible  to  design  a  two-lens  system 
(and  possibly  systems  using  more  than  two  lenses)  for  motion  estimation  that  could 
effectively  create  two  images  that,  when  differenced,  would  produce  a  bias-free  image 
that  could  be  reliably  registered  with  a  fast  correlation-based  algorithm.  If  sensor 
noise  is  a  concern,  the  differencing  of  the  image  and  its  optically  low-pass  filtered 
content  would  effectively  double  the  amount  of  noise  that  would  need  to  be  mitigated. 
This  optical  differencing,  as  was  noted  in  Chapter  III  also  produces  images  that  have 
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most  of  their  power  concentrated  in  the  edges  of  the  objects  in  the  scene.  This  is 
also  an  area  that  may  be  fruitful  for  future  exploration. 

Another  area  that  could  be  studied  in  the  future  is  the  analysis  and  mitigation 
of  bias  in  different  image  registration  algorithms.  Robinson  and  Milanfar  discuss 
estimator  bias  for  gradient-based  image  registration  in  [46].  There  are,  however, 
a  number  of  different  ways  to  register  images  including  methods  based  on  cross¬ 
correlations  as  described  in  Chapter  Ill,  methods  based  on  mutual  information  [43], 
and  methods  based  on  landmarks  within  images  [27].  A  comparison  of  these  regis¬ 
tration  methods  and  their  inherent  estimator  biases  would  be  an  excellent  starting 
point  for  another  dissertation  on  image  registration. 

6.2.2  Bounds  on  Registration  Performance.  A  final  area  for  future  research 
is  the  possibility  of  employing  the  Barankin  bound  in  the  area  of  automatic  target 
recognition  (ATR).  One  of  the  interesting  facets  of  the  Barankin  bound  that  is  used 
in  Chapter  IV  is  that  it  may  be  used  to  measure  the  bound  on  estimating  a  shift 
given  other  shift  scenarios  that  are  slightly  different  and  represent  the  most  probable 
sources  of  shift  error.  In  this  dissertation,  this  was  used  to  explore  registration 
estimates  of  projected  images;  however,  it  may  also  be  possible  to  extend  this  work 
to  automatic  target  recognition.  For  example,  Driggers  et  al.  [20]  test  the  ability 
of  a  group  of  human  test  subjects  to  differentiate  between  several  similar  armored 
vehicles  under  different  noise  conditions  and  sampling  rates.  Using  the  Barankin 
bound,  it  may  be  possible  to  calculate  information-theoretic  bounds  on  the  ability  to 
differentiate  between  several  similar-looking  targets  that  represent  the  most  probable 
errors  to  a  target  recognition  problem.  This  should  be  a  relatively  uncomplicated 
extension  to  this  work  but  one  that  may  provide  a  new  way  of  looking  at  the  ATR 
problem. 

This  work  on  the  theoretical  bound  on  image  registration  performance  can  also 
be  expanded  to  include  more  dynamic  cases  such  as  the  registration  of  images  under 
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conditions  of  rotation  and  dilation.  For  a  given  scene,  using  the  linear  operator 
notation  that  used  in  this  dissertation,  a  dilation  or  a  rotation  can  be  modelled 
as  yet  another  filtering  and  sampling  operation.  These  bounds  have  been  explored 
to  some  extent  in  [55],  however,  it  may  be  useful  to  examine  the  Barankin  bound 
of  images  under  these  same  conditions.  In  fact,  because  most  imaging  sensors  are 
square  arrays  of  detectors,  any  rotation  of  an  imaging  sensor  necessarily  changes  the 
amount  of  mutual  information  in  two  images  of  the  same  scene.  For  some  scenarios, 
this  effect  may  be  negligible;  however,  for  remote  sensing  applications,  this  change 
in  information  may  make  a  substantial  difference.  Bounds  on  this  type  of  estimation 
are  another  area  that  provide  an  opportunity  for  additional  research  with  military 
applications. 

6.2.3  Block-Based  Denoising.  As  mentioned  in  Chapter  V,  the  smoothing 
algorithms  from  the  literature  and  those  introduced  in  this  dissertation  all  generally 
produce  results  that  exceed  the  CRLB.  Block-based  or  other  smoothing  algorithms 
provide  estimates  that  are,  for  most  images,  much  better  than  a  straight  forward 
CRLB  calculation  would  predict  by  introducing  additional  assumptions  about  the 
frequency  content  of  natural  images.  In  particular,  the  underlying  assumption  is  that 
most  natural  images  contain  predominantly  low-frequency  spatial  content  and  that 
by  averaging  using  linear  or  non-linear  Liters,  a  reasonable  estimate  of  the  images 
under  study  can  be  provided. 

One  interesting  potential  extension  to  this  research  is  a  calculation  of  the  the¬ 
oretical  bounds  on  the  performance  of  image  smoothing  algorithms.  One  way  to 
approach  this  problem  would  be  to  introduce  additional  assumptions  about  image 
content  by  modeling  the  diffraction-limited  image  as  something  like  a  Gibbs  distribu¬ 
tion  [48].  Alternatively,  it  may  be  possible  to  model  the  image  by  modeling  its  local 
variation  using  techniques  such  as  those  described  in  [1,5,6,8,13]  and  to  calculate 
a  bound  from  this  localized  structure.  Since  all  of  these  methods  assume  that  there 
are  local  image  variations  and  attempt  to  account  for  them,  it  may  be  possible  to 
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create  either  an  algorithm-specific  or  generalized  bound  for  denoising  a  particular 
image. 
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Appendix  A.  Important  Derivations 


This  appendix  provides  the  derivations  of  important  results  used  in  this  disser¬ 
tation. 

A.l  Calculation  of  the  FOM  Used  in  Chapter  III 

Cain  et  al.  in  [11]  introduced  a  figure  of  merit  (FOM)  which  was  modified  in 
(3.10)  to  account  for  the  effects  of  filtering  in  the  projections.  This  appendix  shows 
how  (3.10)  can  be  used  to  derive  (3.12).  For  a  given  image,  the  numerator  and  the 
denominator  of  (3.10)  are  examined  separately. 

The  derivation  below  generalizes  some  of  the  random  variables  using  the  nota¬ 
tion  Nn(0,  K)  to  indicate  a  Gaussian  random  variable  with  zero  mean  and  variance 
K .  Since  some  of  these  random  variables  will  be  combined  in  the  course  of  the 
derivation,  the  numeric  subscript  n  G  {1,2}  is  used  to  indicate  the  frame  of  data 
associated  with  it.  This  allows  tracking  independence  of  random  variables  as  they 
are  combined  to  achieve  the  desired  results. 

Turning  first  to  the  numerator,  if  the  projections  of  two  images  are  examined 
over  a  number  of  trials,  the  ensemble  average  of  the  cross  correlation  of  these  two 
filtered  projections  can  be  written  as 

p.  =  (hw  *  \y  -  w'ly)  *  Wf(hm  *  iy  -  wiy).  (A.l) 

Points  on  the  projection  corresponding  to  the  precise  alignment  of  the  two  filtered 
projections  and  a  shift  of  1  can  be  expressed  as 

(Pz(0))  =  (hu,  *  iy  -  wiy )J  Wf(K,  *iy-  Wly), 

(p*(l))  =  (hw  *  h  -  wiy)1  Wf  (hu,  *  Tai y  -  wTaiy)  |a=1 ,  (A.2) 
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where  circular  shifting  is  assumed.  If  {L  =  ^=1  Wf(n,n)},  the  difference  between 
these  two  points  can  be  written  as 

(Pz(0))  -  (p*(1))  =  (hw  *  iy  -  wiy)1  Wf(hw  *  (i y  -  TJy)  -  wiy  -  «;TQiy)|Q=1 , 

~  (Il„,  ^  1  y  Wiy')  (Il^j  ^  (ly  TQly)  Wiy  ^T^ly) 

^  a  =  l 

(A.3) 

Noting  that  hltering  correlates  adjacent  terms  in  the  filtered  projections  and  again 
assuming  circular  shifting,  this  can  be  further  reduced  to 

<p«(0))  -  (P,(l)>  «  ^(||i5||2-(vT„i,> 

-  (hw  *iy,wiy)  +  (hw  *  i y,  wTaiy) 

+  (wiy,  Wly)  -  (wiy,WTJy))\a=w, 

~  (Hull2  -  (iy,Taiy)  -  (wiy, wiy) 

+  (wiy,  WTaiy)  +  (wiy,  wiy ) 

-  (wiy,wTaiy))\a=w , 

«  L  (VAR[i]  —  COV(iTTai)) \a=w  , 

«  LN2  (VAR [I]  -  COV(%))  .  (A. 4) 

In  the  denominator  of  (3.10),  the  effect  of  noise  on  the  FOM  is  accounted  for.  To 
do  this,  the  effect  of  the  addition  of  AWGN  of  variance  cr2  on  the  expected  value 
of  the  variance  of  the  difference  between  two  points  on  the  cross  correlation  of  the 
two  projections  is  examined.  The  data  in  the  projection  is  represented  as  = 
iy  +  qn  where  qn  is  a  vector  of  Gaussian  random  variables  Nre(0,  <r2./V).  Using  this 
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formulation,  and  again  assuming  circular  shifting, 


E[VAR[pz(0)  —  pz(l)]]  =  E[VAR[(hw  *  (i,  +  qi)  -  wdliZ)T 


Wf((iy  —  T^iy)  +  (q2  +  TQq2 


a=w  ’ 


=  VAR 


'N- 1 


W f(n,n)  (Ni(0,  wa2N)(iy  -  (Taiy)) 


n= 0 


+Ni(0,w2iY)K2(0,2u2iV) 


+  <(hw  *  iy)  -  wdi)Z)K2(0,  2 a2N))]  \a=w  , 

«  VAR  pST^O,  w2LW((i  -  (Tai))2»  +  3ST(0,  2w4LV2)]  , 
«  w2LV2(2(VAR(I)  -  COV(IH)  +  a2)  (A.5) 


Combining  (A. 4)  and  (A.5)  and  normalizing  by  the  size  of  the  filter,  w,  yields  the 
desired  result 


Fpy(0,  -1) 


(LN2  (VAR[I]  -  COV(I|ic)))2 
wV2LV2(2(VAR(I)  -  COV(%))  +  a2) 


(A.6) 


A. 2  Derivation  of  Theoretical  Performance  Bounds 

A. 2.1  Derivation  of  the  CRLB  for  a  Projected  &  Filtered  Image.  For  a 
given  image,  the  terms  (4.9),  (4.10)  and  (4.11)  are  derived  as  follows: 


1.  Taking  the  natural  logarithm  of  (4.7)  yields 


lnp(fij3/,  f2,j/|ij/,  ok)  =  —N  ln[27r<72lV] 


2  au 


2a2N 

(f 2,y  -  HpTyywy(fy  -  Hpivy 
2  a2N 


(A.7) 


Taking  the  partial  derivative  with  respect  to  a  produces 


d\np(flty,i2ty\\y,a) 

da 


a2N 


(f2,8-HpTt,i,)TW-l9H^“1».  (A. 8) 
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Then, 


d 2  lnp(fiiy,  f2>3/ |iy ,  op 
da2 


(f  —  HT  i 

U2>2/  W  Qoi2 


SHpT^iy 


i  ( <9HpT(lij 


(A.9) 


Using  the  differentiation  property  of  the  Fourier  Transform,  it  can  be  shown 
that  7^i(x  —  a)  =  —  a)  and  ^i(x  —  a)  =  —  a).  Using  this 

relationship,  the  differentiation  can  be  changed  to 


d 2  lnp(flij/,f2,j/|iy,Q) 
da2 


(c  _  it  rp  •  ^pTab 

U2.j/  ^pTa's)  W  Qx2 


(9HpTaiy 


_i  ( <9HpTai4 


(A.10) 


Taking  the  negative  of  the  expectation,  it  is  found  that 


d 2  lnp(fiiY,  f2,y|iy,  a) 


1  ( iff  IpT.,iv 


a2N  V  dx 


i  (  iff  IP  I  .  i? 


Recalling  that  W-1  =  (HprHp)  1  =  (Hp)_1(HpT  )  1  and  employing  the  com¬ 
mutative  operator  Dx  leads  to 


d2  lnp(fi>y,  f2>y|iy, «) 


cr2A’^p  HpToD^iyll  , 


(A.ll) 
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2.  Equation  (A. 8)  is  then  employed  to  find  the  elements  of  the  vector  JQX  as: 


J«x 


-E 


-  ( — (f 


d\y  \  <r2N 


2,y 


HpTaiy)  W 


Tns7— 1^-^-pTaiy 


da 


E 


<J2N 


(f2,y  -  HpTQiy)TW 


_l(92HpTaiy 


dadL 


^_19HpTQ,i.y 


dL 


1 


a2N 

1 


(HpTq)tW_1 


da 

dHpTgjy 

da 


TqtHpt(Hpt)  'Hp-'HpT^^, 


a2N 

1  T>  • 

'^NVxly- 


(A.12) 


3.  Finally,  differentiation  with  respect  to  the  nuisance  parameters  themselves  pro¬ 
duces  Jxx 


=  -E 


(fi,y  —  Hpiy)TW-1(fi)y  Hpi: 


2u2N 


(hy  “  HpTQiy)rW-1(f2,y  -  HpTQi 


=  -E 
1 


2a2N 

d  f  2HpTW-1(f1)y  -  Hpiy)  2(HpTQ)TW-1(f2,y-HpTQiy 


+ 


a2N 
(  2Z 


[diy  V  2cr2N 

(HptW-1Hp  +  T0!THpTW_1HpTQ)  , 


2cr2N 


V  a2N 


(A.13) 


where  Z  is  the  identity  matrix. 

Using  (A. 11),  (A.12),  and  (A.13)  it  is  now  a  simple  matter  to  block  invert  (4.8)  using 
the  Schur  information  complements  Sx  and  Sa  as 


J1  = 


(A.  14) 


109 


where 


c  —  T  —  T  ~\  ~l~\  1 

^ ct  u  aa  u  axu  xx  Jax  •> 


G  —  T  _  J  T  -i  —  It 

uxx  ax  °  a.a  ^  ax • 


(A. 15) 
(A. 16) 


Since  the  purpose  of  the  derivation  is  to  find  the  CRLB  of  the  shift  estimate,  the 
other  elements  of  the  FIM  can  be  ignored  to  solve 


Sa 


-i 


cr2N 


W^xh 
=  2<j2N 


j  D  i  \  ’ 

2  \-Lyx1y’)  xLy / 


X)  i  I  2 

xLy  | 


-1 


(A.  17) 
(A.  18) 


A. 2.2  Derivation  of  the  Two-Dimensional  CRLBs.  For  the  2-D  case,  the 
natural  logarithm  of  the  joint  probability  of  two  image  frames  represented  as  vectors 
(1.8)  and  (1.9)  can  be  written  as 

lnp(D1,D2|I)  =  -^(Di-HpHjfW-^Di-HpHoI) 

-i(E>2  “  HeHoTa,/3I)TW-1(D2  -  HPH0T„^I) 

+ constant .  (A.  19) 


Taking  the  partial  derivative  of  the  log  likelihood  equation  (A.  19)  with  respect 
to  ol  produces 


<91np(Di,  D2|I,  a,  /3) 
da 


^(D2-HtlHoT, 

cH 


a)Tw-i^HpHoT^I_ 

da 


Then,  the  differentiation  property  of  the  Fourier  Transform  is  again  employed,  (i.e. 
—  a)  =  -J^i(x-a)  and  -J^i^x  —  a)  =  jAi(x-a))  to  change  the  differentiation 
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to 


<92  lnp(D1:  P2 1 1,  ct,  (3) 
da2 


1  /D?w_,92HpH0T„,flI 

c2  \ 


da 2 


dUpnoTc^iy  W-1  faHpHoT^I 


dx 

-  (HpH0TQ)/3I)tW- 


dx 

igHpHoT^I 

<9a2 


(A.21) 


Taking  the  negative  expectation  leads  to 


-E 


a2  In  p(D ! ,  D2 1 1  ■  ck ,  /3) 
<9a2 


=  -2||DXH0I| 


(A. 22) 


Similarly, 


-E 


<92  lnp(D1,  D2|I.  a,  /9) 

d/32 


=  -jlPyHoIl 


cr 


(A.23) 


Then,  beginning  with  (A. 20) 


^  lnp(Dx,  P2|I,  a,  (3) 
dad/3 


(Tz 


Di’W 


i^HpHoT^I 

dad  (3 


aHpH0Ta^lV  ^HpHoT^I 

5a 


d(3 

-(HpH0Tai/3I)2W 


rw-.a2HpH0Ta,„I 


dad/3 


(A. 24) 


Taking  the  negative  expectation  and  changing  the  variables  of  differentiation  yields 


52lnp(D1,D2|Lx,y) 

dad(3 


(lAIIoI.  D„H0I) . 


(A. 25) 


Differentiating  with  respect  to  the  nuisance  parameters,  again  return  to  (A. 20)  and 
find, 


52  lnp(D1;  P2|I.  a,  (3) 
dadl 


(7 


(HpH0T^)tW 


_i  ^HpHpTa^I 

da 


(A. 26) 
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Completing  the  differentiation,  changing  the  variable  of  differentiation,  and  taking 
the  negative  expectation  produces  the  results 


E 


92lnp(Dl!D2|I  ,_a10) 
dad  I 


= - -(HpH0T«i/3)tW 

-  -  '.Ho'lhllol 

a2 


.^HpHpT^I 

dx 


(A. 27) 


and  similarly 


E 


d2  lnp^DolI.a,/?) 
dfidl 


— H0tDj,H0I. 
a2 


(A. 28) 


Finally,  derive 


-E 


|^lnp(D1,D2|I) 


=  E[l|  [(D1  -  HpH0I)TW-1(HpH0) 
+(D2  -  HpH0TQwgI)TW_1(HpH0TQi(a)] 
=  ^diag((HpH0)TW_1HpH0 

+  (HpH0TQ,/3)rW-1(HpH0Ttt^)), 

=  ^diag  (H03  Hq) 


(A. 29) 


If  a  shift  vector  is  defined  7  =  [a,  /3],  a  block  FIM  J  can  be  created  of  the  form: 


7  T 

U  ryry  /yQ 

J70  Joo 


(A. 30) 
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where,  for  the  filtered  images,  combining  results  from  (A. 22),  (A. 23),  and  (A. 25) 
gives 


||£*H0I| 


(D,H0I.  DyH0I) 


'77 


9  70 


G 


(“D^HoI,  DyH0I)  IID^HJI 


1  r 


H0tD2H0I  H  0tD„H0I 


— diag  (H0tH0) 


Inverting  the  FIM  yields 


r— 1 


S7“1 

a  1 7  1  7 

^70  u  77 


T  -1T  S  " 

^7' y  70  ‘'-'o 

a  -1 


where 


S7  —  J  77 


T  7  _1T  T 

7o«J  00  J70  5 


S  —  T  —  TiT_iT 

0  00  u70  °  77  ^70* 


Then 


s7-x 


=  2  o'2 


||D.tH0I||2  (DxH0I,  VyH0I) 
(D,H0I.  D„H0I)  ||D:yH0I||2 


-1 


(A.31) 

(A. 32) 
(A. 33) 


(A. 34) 


(A. 35) 
(A. 36) 


(A. 37) 
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Appendix  B.  Mathematical  Background  and  Related  Theory 


This  appendix  describes  some  of  the  basic  mathematical  concepts  and  notational 
conventions  used  throughout  this  paper.  It  describes  some  of  the  statistical 
underpinnings  of  image  denoising  problems  and  describe  the  notation  that  is  used 
commonly  in  the  dissertation  to  address  the  related  problems  of  block-based  image 
denoising  and  image  registration. 

Section  B.l  provides  a  discussion  of  the  chi-square  distribution.  The  section 
provides  a  discussion  of  some  of  the  salient  features  of  the  chi-square  distribution 
and  a  discussion  of  how  it  occurs  in  image  denoising  problems.  This  dissertation  also 
uses  the  correlation  structure  of  an  image  as  an  exploitable  underlying  property  of  a 
noisy  set  of  data.  Section  B.2  describes  the  notation  and  methods  used  for  measuring 
the  correlation  structure  of  the  projections  of  an  image  and  of  an  entire  image. 
The  Cramer-Rao  lower  bound  and  Barankin  bound  are  two  established  methods 
for  estimating  bounds  on  the  mean-squared  error  of  a  parameter  such  as  alignment. 
Therefore,  Section  2.3  reviews  the  Cramer-Rao  lower  bound  and  the  Barankin  bound. 
Finally,  Section  2.4  describes  the  method  used  to  model  defocus  errors  in  an  optical 
system. 

B.l  The  Chi-Square  Distribution 

This  section  describes  the  chi-square  distribution  and  provides  a  working  un¬ 
derstanding  of  it  that  is  of  fundamental  importance  in  describing  and  understanding 
the  denoising  algorithms  in  this  dissertation.  Although  a  thorough  description  of  the 
characteristics  of  the  noncentral  chi-square  distribution  is  contained  in  [29]  among 
other  places,  it  is  worthwhile  to  discuss  the  significance  of  this  distribution  within 
the  context  of  the  image  denoising  problem.  As  noted  in  [18],  [28],  and  [29],  the 
noncentral  chi-square  distribution  is  the  PDF  of  the  sum  of  normally-distributed, 
unit- variance,  squared  random  variables  ay  with  means  /i*.  If  a;  is  a  random  variable 
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with  a  chi-square  distribution,  then 


x  =  S^Jxl  (R1) 

i=  1 

In  this  equation  the  parameter  v  is  referred  to  as  the  “degrees  of  freedom”  of  the 
variable.  The  sum  of  the  squares  of  the  means  forms  the  “noncentrality  parameter” 
of  the  chi-square  distribution.  This  parameter  is  denoted  as  A  where 

V 

A  =  J>,2.  (B.2) 

i= 1 


The  PDF  of  the  distribution  may  be  written 


p(x ) 


J2 


2U/2 


k= 0 


(t)* 

a  r(-|  +  *)’ 


x  >  0, 


(B.3) 


where  the  Gamma  function,  T(w)  is  defined  as 


T(u)  =  /  tu~Le^dt. 

J  o 


(B.4) 


The  noncentral  chi-square  distribution  has  the  attributes 


E[x]  =  v  +  A,  (B.5) 

VAR  [re]  =2  u  +  4A.  (B.6) 

This  distribution  is  denoted  using  the  symbol  \l (A). 

Using  (B.3),  it  will  be  useful  to  examine  the  behavior  of  the  distribution  using 
varying  values  of  v  and  A.  These  distributions  are  shown  in  Figures  B.l,  B.2,  B.3, 
B.4,  and  B.5.  In  the  next  section,  a  discussion  of  the  reason  for  the  occurrence  of 
this  distribution  will  be  presented.  It  will  become  evident  that  A  is  a  function  of  the 
noise  present  in  the  image  and  will  be  fixed  in  the  applications  presented. 
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Figure  B.l:  Graph  of  the  PDF  of  x'vW  wifi1  A  =  0.  This  dis¬ 
tribution  is  identical  to  that  of  a  central  chi-square  distribution. 
Varying  values  of  u  are  plotted. 


Figure  B.2:  Graph  of  the  PDF  of  yy  (A)  with  A  =  1.  Varying 
values  of  u  are  plotted. 
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Figure  B.3:  Graph  of  the  PDF  of  %^(A)  with  A  =  3.  Varying 
values  of  v  are  plotted. 


Figure  B.4:  Graph  of  the  PDF  of  xl  (A)  with  A  =  6.  Varying 
values  of  v  are  plotted. 
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Figure  B.5:  Graph  of  the  PDF  of  %^(A)  with  A  =  10.  Varying 
values  of  v  are  plotted. 

B.1.1  Occurrence  of  the  Chi-Square  Distribution  in  Image  Processing  Prob¬ 
lems.  The  research  described  in  this  dissertation  includes  development  of  meth¬ 
ods  for  denoising  images  that  rely  on  mean-squared  error  calculations  between  image 
subregions.  When  one  looks  at  the  mean-squared  error  between  identical  diffraction- 
limited  subimages  that  have  additive  Gaussian  noise,  the  PDF  of  the  resulting  mean- 
squared  error  distribution  will  be  a  noncentral  chi-square  distribution.  The  reason 
that  this  distribution  occurs  can  be  explained  using  the  following  logic. 

Say  I  is  defined  as  a  diffraction-limited,  n  x  n  block  of  an  image  where  n  is  a 
positive  integer.  For  notational  simplicity,  ignore  the  location  of  this  block  within  the 
image  and  use  the  variables  u  and  v  to  index  the  individual  pixels  of  this  subimage. 
If  I  is  corrupted  by  zero-mean  Gaussian  noise  so  that  for  each  pixel,  I(u,  v)  where 
u  G  {1, ...,  n}  and  v  G  {1, ...,  n},  then 

D  i,j(u,v)  =  I  (u,v)  +  Q  ij(u,v), 
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where  D ij(u,v)  is  a  subimage  centered  at  ( i,j )  and  is  the  realization  of  the 
Gaussian  noise  within  that  subimage.  In  matrix  notation  this  can  be  described 

Djj  IT  Qij, 

where  is  an  n  x  n  matrix  of  Gaussian  random  variables  and  represents  the 
realization  of  noise  in  this  block.  Within  an  image,  there  may  be  other  n  x  n  blocks 
centered  at  coordinates  (s,  t )  that  satisfy  the  equation 

D«,t  =  I  +  Q  s,t- 

If  these  subimages  exist,  then  AS)t,  the  error  between  and  Dst  can  be  calculated 

ASit  =  D  ij  —  =  (I  +  Q  ij)  —  (I  +  Q  s,t), 

=  Q ij  -  Q u,v  (B.7) 


For  a  single  pixel  in  As  t, 


A a,t(u,v)  =  Q ij(u,v)  -  Q s,t(u,v). 


At  this  point,  note  that  Q itj(u,v)  stays  constant  while  Q s>t(u,v)  varies  over  all  the 
blocks  that  satisfy  the  original  conditions.  Thus  it  is  possible  to  treat  Q ij(u,v)  as 
a  deterministic  value  and  Q sj(u,v)  as  a  Gaussian  random  variable  with  standard 
deviation  er.  The  PDF  of  A s>t(u,v)  can  then  be  written  as 


p(As,t(u,v)\Qij(u,v)) 


1 


y/2lUJ 


exp 


^(A(u,u)  -  Qij(«,»;))2j 


(B.8) 
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For  notational  simplicity  define  the  Frobenius  norm  as  Watkins  does  in  [54]  using 
the  notation  ||A||F  to  be 

(B.9) 

where  A (i,j)  are  the  elements  of  the  n  x  n  matrix  A.  Using  the  Frobenius  norm, 
the  mean-squared  error  can  then  be  calculated  as  the  quantity 

||Asd|i 

MSE  —  s,t"F.  (B.10) 

n2 

where  the  numerator  is  a  sum  over  all  u  and  v  indexing  AS)<  and  the  denominator  is  a 
constant  that  depends  on  the  size  of  the  subimage.  Furthermore,  the  numerator  is  a 
sum  of  the  squares  of  Gaussian  random  variables  of  varying  means.  The  sum  of  these 
Gaussian  random  variables  is,  by  definition,  a  noncentral  chi-square  distribution  [28]. 
The  arrival  at  this  distribution  can  also  be  shown  experimentally  as  shown  in  the 
following  subsection. 

B.1.2  Statistical  Characteristics  of  an  Experimentally  Determined  Distribu¬ 
tion.  Say  there  is  a  single  3x3  portion  of  a  diffraction-limited  image  that  can  be 
represented  as  shown  in  (B.ll).  This  subimage  is  denoted  as  I. 

(  100  120  131  ^ 

1  =  45  190  43  (B.ll) 

v  100  140  100  y 

If  zero-mean,  normally-distributed  noise  with  a  =  25  is  added  to  I.  a  noise  realization 
for  the  block  may  be  obtained  that  is  similar  to  Q itj  shown  in  (B.12). 
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Q  i,j 


(  -10.8141 
-41.6396 
v  3.1333 


7.1919 

-28.6618 

29.7729 


29.7291  ^ 
-0.9408 
8.1823  y 


(B.12) 


Using  the  notation  (  )  to  indicate  a  sample  mean,  this  realization  of  the  noise 
has  a  mean  of  (Q;p)  =  §  S”=i  S”=i  v)  =  — 0.45  and  a  variance  of  a2  = 

507.78. 

At  this  point  it  is  helpful  to  note  that  although  the  distribution  of  the  noise 
has  zero  mean  and  a2  =  625,  the  mean  of  the  realization  of  the  noise  in  the  3x3 
matrix  is  not  zero  and  the  variance  is  less  than  the  variance  that  would  be  observed 
over  a  larger  array  of  numbers.  This  difference  will  become  significant  later  on.  For 
now,  it  is  interesting  to  examine  statistics  of  the  mean  values  of  the  noise. 


B.1.3  Statistics  of  the  Sample  Mean  of  n  x  n  Noise  Samples. 
sized  subimage,  the  mean  of  the  noise  within  the  subimage  is  defined 

(Q) =  ELSiOmM, 

l'°  n 2 

In  this  equation,  Z)"=i  Q*  j(u,v)  represents  the  sum  of  i.i.d.  zero-mean  Gaus¬ 

sian  random  variables.  To  calculate  the  PDF  of  this  mean,  recall  that  the  PDF  of 
the  AWGN  can  be  represented  as 

p(x)  =  ^ew& 

As  noted  in  [35]  the  sum  of  random  variables  with  PDFs  described  by  (B.14)  can  be 
found  as 


(B.14) 


For  an  n  x  n 

(B.13) 


p(u) 


n 


liner 


—  ( nu )2 
2a2 


(B.15) 
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Using  (B.15),  it  can  be  shown  analytically  that  varying  the  size  of  the  n  x  n  subimage 
will  change  the  variance  of  the  mean  of  the  noise.  In  Figure  B.6  this  is  demonstrated 
graphically  using  analytic  data  to  graph  the  mean  of  the  noise  in  blocks  of  varying 
size  where  a  =  2.  It  is  verified  using  randomly  generated  experimental  data  in 
Figure  B.7. 


Figure  B.6:  Analytically  constructed  plot  of  the  PDF  of  the 
mean  of  a  set  of  noise  samples.  Note  that  the  sample  mean  of  a 
set  of  zero-mean  noise  realizations  is  more  likely  to  be  zero  with 
a  larger  block  size. 


B.  1.3.1  Corruption  of  a  Subimage  with  Noise.  Returning  again  use 
to  the  image  model  Dj  J  ~  1  +  Q ij  a  sample  noisy  block  with  AWGN  can  be  created 
as 


D 


(  89.1859  127.1919  160.7291 
3.3604  161.3382  42.0592 

103.1333  169.7729  108.1823  j 


(B.16) 
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Figure  B.7:  Experimentally  constructed  plot  of  the  PDF  of 

the  mean  of  a  set  of  noise  samples.  Data  was  constructed  using 
100,000  ensembles  of  data  with  the  indicated  block  size.  Results 
closely  approximate  those  predicted  analytically  and  shown  in 
Figure  B.6 
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The  mean  of  this  noisy  block  is  107.217  and  its  variance  is  2831.0.  For  reasons 
that  will  become  evident  later  in  this  dissertation,  it  is  desirable  like  to  subtract  the 
mean  from  this  block  and  then  examine  its  mean-squared  error  in  relation  to  other 
blocks  that  have  been  created  through  a  similar  process.  This  process  will  yield  a 
chi-square  random  variable.  If  a  mean-subtracted  block  is  created  F,;j  =  D ij  —  (Djj) 
the  zero-mean  result  is 


h3 


(  -18.0311  19.9749 
I  -103.8566  54.1212 
v  -4.0837  62.5559 


53.5121  ^ 
-65.1579 
0.9653  y 


(B.17) 


This  block  is  now  compared  to  other  blocks  that  are  statistically  similar. 


B.  1.3.2  Generating  Statistically  Similar  Subimages.  Say  that  there 
exist  10,000  blocks  that  are  identical  to  block  I  and  say  that  these  blocks  are  also 
corrupted  by  zero-mean  Gaussian  noise  with  standard  deviation  25.  Using  the  same 
process  that  was  employed  to  construct  F,^,  subtract  the  mean  from  these  blocks 
and  call  these  blocks  GSit  where  s  G  {1, ...,  100}  and  t  G  (1, ...,  100}.  Calling  A (u,v) 
the  per-pixel  error  between  F,  j  and  G where  the  pixels  are  indexed  using  u,  v,  the 
errors  observed  are  shown  in  Figure  B.8. 

Assuming  there  are  enough  samples  available  to  equate  the  sample  statistics 
with  a  true,  underlying  statistical  distribution,  define  the  expected  value  of  the  error 
between  Fjj  and  GhJ  as  AS)t.  Then, 


A  s,t(u,v) 


( 


=>  A 


S,t 


V 


oo 

=  22  ASjt(u,u)p(  ASjt(u,v)) 

AS:t(u,v)=—oo 

-10.3264  7.4875  30.1458  ^ 

-40.8786  -27.8308  -0.841  • 

3.48  30.5208  8.2521  y 


(B.18) 


(B.19) 
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PDF  of  errors  for  pixel  (1,1)  PDF  of  errors  for  pixel  (1 ,2)  PDF  of  errors  for  pixel  (1 ,3) 


error,  E[x]  =  3.48  error,  E[x]  =  30.5208  error,  E[x]  =  8.2521 


Figure  B.8:  Per  pixel  error  between  similarly  constructed  im¬ 
ages. 

These  errors  are  approximately  equal  to  Q i}j  —  (Q itj).  The  expected  value  for  (Qij) 
is  zero,  and  since  a  priori  knowledge  of  Q;  J  is  not  available,  it  is  necessary  to  assume 
(Q i,j)  =  0  and  A s>t(u,v)  =  Q ,j(u,v). 


B.  1.3.3  Estimation  of  Mean-Squared  Error  .  Under  the  aforemen¬ 
tioned  assumptions,  the  PDF  of  the  measured  mean-squared  error  between  I  and  F,  ; 
can  be  predicted.  Since  the  noncentral  chi-square  distribution  is  constructed  from 
normal-variance  Gaussian  random  variables,  in  order  to  use  the  given  equations,  it 
is  necessary  to  normalize  the  distributions.  Examining  the  per-pixel  squared  error 
for  cr2  ^  1  and  v  —  9  the  following  calculation  is  made  for  the  chi-square  random 
variable 


x 


a- 


E 

i=l 


x ; 


cr 


(B.20) 
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The  scaled  noncentrality  parameter  is  found 


4  = 

(J  —  /  ^  (J  ~  '  ^  (J  ~  '  ^ 


(B.21) 


i=l  i= 1  2—1 

The  expected  mean  of  the  scaled  data  is  calculated  from  the  known  values  for  the 
noncentrality  parameter  A  and  the  number  of  degrees  of  freedom  v  to  be 


E[x] 


O 


=  z/  +  A  =  9  +  9  =  18, 


(B.22) 


which  lead  directly  to  the  first  moment  of  the  PDF 


E[x]  =  11,250. 


(B.23) 


The  variance  can  also  be  calculated  from  the  noncentrality  parameter  A  and  the 
number  of  degrees  of  freedom  v  as 

VAR[4l  =  21/  +  4A  =  2(9)  +  4(9)  =  54,  (B.24) 

<7Z 

which  then  leads  to  the  variance 

VAR[x]  =  546252  =  21,093,750. 

To  examine  the  mean-squared  error  of  variables  with  nine  degrees  of  freedom  (cor¬ 
responding  to  the  nine  pixels  in  the  subimage),  divide  the  mean  values  by  9  and  the 
variance  value  by  92  =  81  to  give  the  mean  value  over  each  block,  thus  yielding 

E[x] predicted  =  1,  250,  (B.25) 

VAR  [a;]  predicted  =  260,416. 
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Modifying  the  calculations  to  incorporate  knowledge  of  the  actual  means  then, 


Y.Y. 

U=  1  V=l 


ES!t(u,v) 


=  7.247. 


a 


(B.26) 


and 


E[x]  calculated  —  lj  127,  (B.27) 

VAR  [x\ calculated  =  227,  120. 

Using  randomly  generated  data  as  described  above,  the  measured  values  are 

X) calculated  ~  1)062, 

( X  ) calculated  ~  (X)  calculated  ~  218,220 

Graphically,  the  measured  PDF  vs.  the  predicted  and  calculated  PDFs  are  shown 
below  in  Figure  B.9.  As  shown  graphically,  the  error  distribution  can  be  predicted 
with  reasonable  accuracy  using  only  the  block  size  and  the  know  mean  and  standard 
deviation  of  the  noise. 

B.2  Calculating  the  Covariance  Present  in  linages  and  Image  Projections 

Average  covariances  are  used  in  some  calculations  in  this  dissertation.  This 
section  describes  these  covariances  are  measured  and  the  notation  that  is  used  to 
denote  these  measurements.  In  Chapter  III,  the  covariances  of  image  pixels  are 
calculated  based  on  their  measured  values  in  projections.  Since  the  calculations 
rely  on  information  available  in  the  projections,  out  of  necessity,  these  measures 
of  pixel  covariances  ignore  some  relationships  that  are  evident  in  2-D  that  are  not 
evident  in  1-D  projections.  Consequently,  some  of  the  sample  covariances  measured 
are  better  described  as  average  covariances.  This  averaging  is  acknowledged  using 
overbar  notation  and  expressed  as  C0V2,  z  G  {x,  y}  as  they  occur  where  z  indicates 


(B.28) 
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Figure  B.9:  Comparison  of  measured  vs.  predicted  data  with 
xl (A)  with  A  =  15  distribution.  Data  labeled  “measured  PDF  ” 
is  calculated  according  to  the  conditions  of  (B.29),  data  labelled 
“PDF  predicted  with  knowledge  of  block  noise”  is  calculated 
accoding  to  conditions  of  (B.28)  and  data  labelled  “PDF  pre¬ 
dicted  without  a  priori  knowledge”  is  calculated  according  to 
conditions  of  B.28. 

the  axis  of  the  projection.  For  a  point  indexed  with  x  in  a  projection  define  this 

average  covariance  as 


COVy(I|a) 


N(N-1) 


IN- 1  N—l 

EE 

i  3/1=0  »2=0 

V2^yi 


K>,?/i)I(a;  +  a,y2) 


(I)2.  (B.29) 


In  (B.29),  (  )  indicates  an  average  over  all  x  in  the  fist  term  and  an  average  over  all 
x  and  y  in  the  second  term.  Sample  plots  of  covariances  measured  for  two  images 
are  shown  in  Figures  B.10-B.13. 

When  uncorrelated  noise  is  added  to  the  image,  the  center  of  the  covariance 
plot  (corresponding  to  the  variance  of  the  noise)  increases.  However,  the  effect  on 
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Figure  B.10:  1024  x  1024  grayscale 

image  of  the  Pentagon  image  from 
http://sipi.usc.edu/database/. 


Figure  B.  11:  Measured  covariance  of 
the  column  projections  of  the  image  in 
Figure  B.10.  The  covariance  in  this 
graph  is  calculated  using  circular  shift¬ 
ing. 


the  off-center  values  of  the  covariance  function  is  much  less  pronounced.  This  effect 
is  shown  pictorially  in  Figures  B.14  and  B.15.  In  fact,  the  off-peak  values  are  similar 
enough  to  suggest  that  the  correlation  structure  for  these  images  may  be  estimated 
fairly  accurately  by  low-pass  filtering  the  measured  covariance  function  of  a  noisy 
image  and  estimating  the  peak  using  the  slopes  of  the  points  z  =  —a  through  z  =  +a 
where  a  represents  an  arbitrary  but  small  value.  This  will  allow  the  adaptive  design 
of  a  filter  for  a  given  image  that  will  minimize  registration  errors.  It  may  assist  in 
design  a  optimal  filter  to  denoise  the  image. 
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(zlO)AOO 


Figure  B.12:  256  x  256  grayscale 

aerial  image  of  a  chemical  plant  from 
http://sipi.usc.edu/database/. 


Figure  B.13:  Measured  covariance  of 
the  column  projections  of  the  image  in 
Figure  B.12.  The  covariance  in  this 
graph  is  calculated  using  circular  shift¬ 
ing. 


Figure  B.14:  Measured  covariance  of 
column  projections  of  the  pentagon  im¬ 
age  with  AWGN  of  a  =  100. 
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Figure  B.15:  Measured  covariance  of 
the  column  projections  of  the  image  in 
Figure  B.12  with  AWGN  of  a  =  100. 
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